In Google Sheets, say I have this:
| Word | Onset |
|---|---|
| bɐ.ˈɾi.ɕiɾ | |
| ˈbɑː.tiɾ | |
| ˌbɛs.bɛ.ˈtɾɑɪ̯.ɕe | |
My regexextract in Onset looks like this:
=REGEXEXTRACT(A109,"[.ʔ]ˈ?ˌ?([^ɐɛiouɪɑʊɔ]+)[ɐɛiouɪɑʊɔ]")
That is: extract the non-vowels that are preceded by a . or ʔ (plus an optional ˈ or ˌ) and followed by a vowel.
I want to extract all instances within a word--this catches only the t in ˈbɑː.tiɾ and throws a #N/A with the others, and if I add another one:
=REGEXEXTRACT(A109,"[.ʔ]ˈ?ˌ?([^ɐɛiouɪɑʊɔ]+)[ɐɛiouɪɑʊɔ]+[.ʔ]ˈ?ˌ?([^ɐɛiouɪɑʊɔ]+)[ɐɛiouɪɑʊɔ]+")
This will only catch the ɾ and ɕ within bɐ.ˈɾi.ɕiɾ but throw a #N/A for the rest.
I've tried clustering and adding a plus like so:
=REGEXEXTRACT(A99,"([.ʔ]ˈ?ˌ?([^ɐɛiouɪɑʊɔ]+)[ɐɛiouɪɑʊɔ])+")
The result I get is this:
| Word | Onset | |
|---|---|---|
| bɐ.ˈɾi.ɕiɾ | .ɕi | ɕ |
Which is, uh, not what I wanted.
(and I tried the {A, B} thing, and adding the second set with a question mark...)
So what I'm looking for is
| Word | Onset | | |
|---|---|---|---|
| bɐ.ˈɾi.ɕiɾ | ɾ | ɕ | |
| ˈbɑː.tiɾ | t | | |
| ˌbɛs.bɛ.ˈtɾɑɪ̯.ɕe | b | tɾ | ɕ |
Or maybe
| Word | Onset |
|---|---|
| bɐ.ˈɾi.ɕiɾ | ɾ, ɕ |
| ˈbɑː.tiɾ | t |
| ˌbɛs.bɛ.ˈtɾɑɪ̯.ɕe | b, tɾ, ɕ |
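To make the target behavior concrete, here is a minimal sketch in Python (not Sheets) of "extract every match, not just the first". It reuses the pattern from the first formula with `re.findall`. Two things here are assumptions rather than part of the formulas above: the helper name `onsets` is made up for illustration, and a plain `e` has been added to the vowel class, since without it the final syllable ɕe would not count as consonant plus vowel.

```python
import re

# Vowel class copied from the formulas above, plus plain "e".
# The extra "e" is an assumption: without it, "ɕe" has no vowel to match.
VOWELS = "ɐɛiouɪɑʊɔe"

# Syllable boundary (. or ʔ), optional stress marks, then the onset
# consonants (the captured group), followed by one vowel.
ONSET = re.compile(rf"[.ʔ]ˈ?ˌ?([^{VOWELS}]+)[{VOWELS}]")


def onsets(word: str) -> list[str]:
    """Return every captured onset in the word, left to right."""
    # With exactly one capture group, findall returns the group text
    # for each non-overlapping match.
    return ONSET.findall(word)


for w in ["bɐ.ˈɾi.ɕiɾ", "ˈbɑː.tiɾ", "ˌbɛs.bɛ.ˈtɾɑɪ̯.ɕe"]:
    # Join with ", " to mimic the single-cell layout in the table above.
    print(w, "->", ", ".join(onsets(w)))
```

The point of the sketch is only that the pattern itself can find all the onsets when the matching function returns every match, which is the behavior I'm after in Sheets.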
I've tried (oh my god, have I tried) to look for a solution, but ultimately I don't understand regex well enough to begin with to properly apply the suggested solutions I've found... I'm hoping someone can spell this out for me.