In my previous blog post, I wrote about the then-new M function, Text.ReplaceAll. Unfortunately, this function is not available anymore. A certain someone from Microsoft contacted me and said that this function was made visible by mistake; it was part of a certified connector and should not have been exposed. They were kind enough to provide the actual code behind the function though. I was quite impressed by the way the function was written, and I decided to run a competition (with a prize!) to write the improved version of the function in the shortest possible way.
To recap, the function performed multiple text replacements, which was exactly what Chris Webb‘s and Ivan Bondarenko‘s functions also did. Imke Feldmann‘s implementation of a similar function was different: it replaced whole words only.
In this competition, contestants need to write a function that combines both techniques, depending on the value of an optional parameter.
Rules
- The function must have three parameters:
- String (type text)
The text to make replacements in.
Example:"the cat sat on a mat"
- Replacements (type list)
A list of lists that contains pairs of replacement values
Example:{{"cat", "bear"}, {"dog", "dragon"}}
- ReplaceFullWords (type logical, optional)
The parameter that decides whether replacements should follow Imke’s logic (full words only) or Chris’s and Ivan’s (substrings)
Example:true
- String (type text)
- The entries will be ranked by the number of characters excluding non-essential spaces (the shortest function wins)
- Readability and performance are not important at all
- The function should assume that words are separated by spaces only
- Using
Expression.Evaluate
and referencing#shared
directly is not allowed - The function must be contained in a single query and external references, such as GitHub, are not allowed
- Not only the above data may be used to check whether the function works correctly, so hardcoded replacements are not permitted
- By taking part in this competition, you agree that your code may be published in this blog
- Anyone can enter the competition, including people from North Korea and other countries commonly excluded from web competitions
- The competition runs until UTC 12:00 pm Monday, 24 June 2019.
- Entries are accepted via email to daniil at xxlbi dot com
Sample data
Download Replacements.pbix
In the file, the Starting point query contains some sample text, and the Dictionary as list query should be used as the second argument in your function.
The Dictionary as table query exists purely for your convenience.
Expected results
When the third parameter is true
When the third parameter is false, null or omitted:
The prize
Given that this is a competition that involves Power Query, I thought it would be appropriate to have a Power Query book as a prize. There are not that many books on Power Query; Gil Raviv’s book, published by Microsoft, is the latest book on the topic, and it includes many exercises to further improve anyone’s skills.
Gil was very kind and provided a digital product voucher for his book, which you can redeem at Microsoft Press Store, or you can give it to someone else! 🙂
I may publish your code here once the competition is over. I would also prefer to publish the full name of the code author — please let me know if you’d like to stay anonymous. Also, please send me a LinkedIn/blog/website link if you’d like me to link your name.
Update: the results!
In total, I’ve received submissions from eight contestants (in order of submission):
- Maxim Zelensky
- Imke Feldmann
- Bill Szysz
- Aleksei Zhigulin
- Zubair Muhhamad
- Tanuki Phoenix
- Andrey Minakov
- Alexandr Morin
Unfortunately, I was a bit unclear in the rules: I didn’t specify whether declaring parameter types is mandatory. For this reason, I decided to count the lengths of function after the first occurrence of “=>”, excluding non-essential whitespace characters.
Some functions below may still have non-essential whitespace, which I left for readability purposes but excluded it when counting the length of functions.
Maxim Zelensky
Maxim is a Microsoft MVP in the Data Platform category (profile) from Russia, who has a blog called Excel Inside and tweets as @Hohlick. Here’s his formula:
(String as text, Replacements as list, optional ReplaceFullWords as nullable logical) as text =>
let p=if ReplaceFullWords=true then" "else""in Text.Trim(List.Accumulate(Replacements,p&String&p,(s,r)=>Text.Replace(s,p&r{0}&p,p&r{1}&p)))
According to Maxim, this is a “typical Excel approach” to replacing words. It uses the List.Accumulate function, which is a perfect fit here. You can read more about the function in Gil Raviv’s blog: Power Query List.Accumulate – Unleashed.
Length
139 characters
Imke Feldmann
Imke, who is from Germany, is also a Microsoft Data Platform MVP (profile). She blogs at The BIccountant and tweets under the same name — @TheBIccountant. Here’s Imke’s approach:
(String, Replacements, optional ReplaceFullWords) =>
let r=(z)=>if ReplaceFullWords=true then" "&z&" "else z,e=Text.Trim(List.Accumulate(Replacements," "&String&" ",(x,y) =>Text.Replace(x,r(y{0}),r(y{1}))))in e
Great minds think alike! Imke’s approach is similar to Maxim’s, she just chose to define the steps a bit differently. Maxim saved a few characters by putting the result directly into the in statement, while Imke used a separate variable for that, e.
Length
156 characters
Bill Szysz
Bill (LinkedIn) is a very knowledgeable Polish expert on Power Query, who has a YouTube channel and is a moderator on the largest Polish Excel forum — ExcelForum.pl. You may also have seen his comments on Power Query all over various community websites. Here is Bill’s formula:
(String,Replacements,optional ReplaceFullWords)=>Text.Trim([S=if ReplaceFullWords=true then" "else"",R=List.Accumulate(Replacements,S&String&S,(F,L)=>Text.Replace(F,S&L{0}&S,S&L{1}&S))][R])
Again, the approach is similar, but there’s an important difference: instead of using the let/in construction, Bill uses fields of a record, then calls a field. Not something you see every day!
Length
140 characters
Aleksei Zhigulin
Aleksei (LinkedIn) actually submitted two variants:
(String as text, Replacements as list, optional ReplaceFullWords as logical) =>
[a = Replacements,
b = String,
c = if ReplaceFullWords = true
then Text.Combine(List.ReplaceMatchingItems(Text.Split(b, " "), a), " ")
else List.Accumulate(a, b, (x,y)=>Text.Replace(x,y{0},y{1}))]
[c]
(String as text, Replacements as list, optional ReplaceFullWords as logical) =>
List.Accumulate(Replacements, String, (a,b)=>
if ReplaceFullWords = true
then Text.Combine(List.ReplaceValue(Text.Split(a, " "), b{0},b{1}, Replacer.ReplaceValue), " ")
else Text.Replace(a,b{0},b{1}))
We see List.Accumulate again, and Aleksei decided to rename the parameters, which added a few characters.
Length
183 and 190 characters, respectively.
Zubair Muhammad
Zubair (website: Excel & Access) wrote his function without List.Accumulate:
(string as text, ReplacementsList as list, optional ReplaceFullWords as logical) =>
let
Replacements=#table(
type table [
Word to Replace = text,
Replace With = text
],
ReplacementsList
),
List1=Replacements[Replace With],
List2=Replacements[Word to Replace],
myrecord=Record.FromList(List1,List2),
myresult=
if
ReplaceFullWords = true then
Text.Combine(
List.Transform(
Text.Split(string," "),each
Record.FieldOrDefault(myrecord, _, _))," ")
else
List.Last(
List.Generate(()=>
[a=0,
b=List2{a},
c=List1{a},
d=Text.Replace(string,b,c)
],
each [a] < Table.RowCount(Replacements),
each [a=[a]+1,
b=List2{a},
c=List1{a},
d=Text.Replace([d],b,c)
]
, each [d]))
in
myresult
In this code, there’s List.Generate, which requires many more characters than List.Accumulate — at least because the result is a list, so you need to pick the last item.
Length
200+ characters
Tanuki Phoenix
Tanu, who tweets as @tanuki_phoenix, also used List.Accumulate, but in his if-then-else construct he used different replacer functions: Replacer.ReplaceText for partial replacements and Replacer.ReplaceValue for full-word replacements.
Text.Combine(List.Accumulate(Replacements,Text.Split(String," "),(a,b)=>List.ReplaceValue(a,b{0},b{1},if List.Contains({null,false},ReplaceFullWords)then Replacer.ReplaceText else Replacer.ReplaceValue))," ")
Length
200+ characters
Andrey Minakov
Andrey (LinkedIn) used a combination of Imke’s original technique with List.ReplaceMatchingItems (for full-word replacements) and List.Accumulate (for partial replacements):
(String as text, Replacements as list, ReplaceFullWords as logical) as text => if ReplaceFullWords then Text.Combine(List.ReplaceMatchingItems(Text.Split(String, " "), Replacements), " ") else List.Accumulate(Replacements, String, (v, i) => Text.Replace(v, i{0}, i{1}))
Note that Andrey is using “if ReplaceFullWords” here without “= true” as most contestants use: this makes the function fail when the third parameter is null or omitted.
Length
180 characters (not adherent to competition rules)
Alexandr Morin
Alexandr is a digital marketing expert, who has a Facebook page devoted to Power BI (Power BI // Power Query // DAX) and a Telegram channel. Alexandr is the only contestant who used recursion:
(String,Replacements,optional ReplaceFullWords)=>
let R=Replacements,S=Text.Split(String," "),F=(L,P)=>let E=List.ReplaceValue(L,R{P}{0},R{P}{1},Replacer.ReplaceText)in if P=List.Count(R)-1 then E else @F(E,P+1)in Text.Combine(if ReplaceFullWords=true then List.ReplaceMatchingItems(S,R)else F(S,0)," ")
Here, renaming the parameters is justified because they are referenced multiple times.
Length
200+ characters
Winner
Congratulations to Maxim Zelensky, whose function is only 139 characters long!
I certainly have learned a few new tricks from this competition 🙂