Channel: Hot Weekly Questions - Web Applications Stack Exchange

Extracting hashtags from a cell - fixing a 99% working regex formula


I'm trying to use this formula to extract hashtags from a cell. I got it from this very old topic.

=trim(regexreplace(A2, "((^|\s)[^#]\S*)|([^#\w\s]\S*)", ""))

Example source:

"This is a line before a line break.
#missinghashtag #2ndhashtag #3rdhashtag"

Current outcome:

#2ndhashtag #3rdhashtag

Desired outcome:

#missinghashtag #2ndhashtag #3rdhashtag

It mostly works, but I have found a small issue that I'm unable to debug, as I'm clueless about regex.

It skips the first hashtag of a new paragraph. My guess is that it looks for words that start with #, but the line break counts as an invisible character before the #, so the first hashtag doesn't get extracted. If I add a space after the break and before the first #, it suddenly works.
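To test this guess outside Sheets, here is a minimal Python sketch. It rests on several assumptions: Python's `re` module stands in for the engine behind REGEXREPLACE, `strip()` only roughly imitates TRIM, and the cell texts (`blank_line`, `space_after_break`) are hypothetical, since the exact whitespace in the real cell is unknown. In this sketch, a single newline directly before the `#` does not lose the hashtag. The loss shows up when two whitespace characters precede it (a blank line here): `\s` takes the first, `[^#]` is allowed to take the second, and `\S*` then swallows the hashtag.

```python
import re

# The pattern from the formula, unchanged. Python's `re` is only a stand-in
# for the regex engine behind REGEXREPLACE (assumption).
PATTERN = r"((^|\s)[^#]\S*)|([^#\w\s]\S*)"


def strip_non_hashtags(text: str) -> str:
    """Roughly mimic =trim(regexreplace(A2, PATTERN, ""))."""
    return re.sub(PATTERN, "", text).strip()


tags = "#missinghashtag #2ndhashtag #3rdhashtag"

# Hypothetical cell contents: a blank line (two newlines) before the tags,
# and the same text with one extra space added after the break.
blank_line = "This is a line before a line break.\n\n" + tags
space_after_break = "This is a line before a line break.\n\n " + tags

print(repr(strip_non_hashtags(blank_line)))
# '#2ndhashtag #3rdhashtag'
print(repr(strip_non_hashtags(space_after_break)))
# '#missinghashtag #2ndhashtag #3rdhashtag'
```

If this matches what happens in the sheet, the extra space helps because `\S*` stops at it, so the match ends before the hashtag instead of running through it.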

That is the only thing I need to fix. Any help would be highly appreciated.

Edit: Actually, I tested some more, and if I add a hashtag before the break, something strange happens. The previously problematic hashtag now works, and the one before the break works too, but they are returned with the break between them, which ideally I'd also like to remove.

Example source:

"This is a line before a line break. #thishashtagworks
#missinghashtag #2ndhashtag #3rdhashtag"

Current outcome:

"#thishashtagworks
#missinghashtag #2ndhashtag #3rdhashtag"

Desired outcome:

#thishashtagworks #missinghashtag #2ndhashtag #3rdhashtag
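To pin down the desired outcome, here is a small Python sketch that collects the hashtags directly instead of deleting everything else. It is not a Sheets formula, only an illustration of the target behaviour. The helper name `extract_hashtags` is hypothetical, and it assumes that a hashtag is a `#` followed by one or more word characters.

```python
import re


def extract_hashtags(text: str) -> str:
    # Assumption: a hashtag is "#" followed by one or more word characters.
    # Joining with a single space also drops any line break between tags.
    return " ".join(re.findall(r"#\w+", text))


# Second example from the question, with the line break written as "\n".
source = (
    "This is a line before a line break. #thishashtagworks\n"
    "#missinghashtag #2ndhashtag #3rdhashtag"
)

print(extract_hashtags(source))
# #thishashtagworks #missinghashtag #2ndhashtag #3rdhashtag
```

Because the tags are matched directly, the whitespace around them (space, line break, or blank line) no longer affects the result.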
