Carim asked an interesting question:
Do you think a UDF could determine the string language … ?
In Switzerland, there are three national languages (German, French, Italian) and it would be very handy to have an UDF return in which language a given string is written
My first thought was to create a group of common words in each language. Then match the contents of the string to those words. I guessed that if the input string had 10 words, I would only need 50 words in the group to predict the language. I have no idea if that’s really enough, but I’d like to know your opinion.
Before I could test my theory, I came to my senses and turned to Google. I came up with the following code:
Public Function DetectLanguage(sInput As String, Optional dConfidence As Double = 0.25) As String
Dim clsLanguage As CLanguage
Dim sReturn As String
Set clsLanguage = New CLanguage
clsLanguage.InputText = sInput
If clsLanguage.Confidence > dConfidence Then
DetectLanguage = clsLanguage.LanguageCode
Else
DetectLanguage = "unknown"
End If
End Function
Nice and clean, but doesn’t really tell you much. It uses the Google AJAX Language API. Instead of me comparing ten words to 50 words in three languages, Google probably compares every word to every word in every language. Here’s some of the CLanguage class:
Public Property Let InputText(ByVal sInput As String)
Dim oHttp As XMLHTTP60
msInputText = sInput
Set oHttp = New XMLHTTP60
oHttp.Open "GET", "http://ajax.googleapis.com/ajax/services/language/detect?v=1.0&q=" & URLEncode(msInputText)
oHttp.setRequestHeader "Content-Type", "applicaton/x-www-form-urlencoded"
oHttp.send
If oHttp.Status = 200 Then
Me.ResponseText = oHttp.ResponseText
End If
End Property
When I set InputText to something, I create a new xmlhttp object, pass it some stuff, send it, then read back the ResponseText. I don’t know the ins and outs of the xmlhttp object, I just copied it from Juan Pablo. So I probably don’t have near the error checking in there that I need. The class has some properties that you can read too.
Public Property Get Confidence() As Double
Dim lStart As Long
Dim lEnd As Long
Const sKEY As String = """confidence"":"
lStart = InStr(1, Me.ResponseText, sKEY)
lEnd = InStr(lStart, Me.ResponseText, "}")
If lStart > 0 And lEnd > lStart Then
Confidence = CDbl(Mid$(Me.ResponseText, lStart + Len(sKEY), lEnd - (lStart + Len(sKEY))))
End If
End Property
I’m parsing ResponseText to get at the piece I need. Here’s an example of what the ResponseText looks like:
{”responseData”: {”language”:”en”,”isReliable”:true,”confidence”:0.75503504}, “responseDetails”: null, “responseStatus”: 200}
From what I can tell, Google determines whether the detection was reliable based on a confidence of 50%. I decided to let the user determine how confident it should be and set the default at 25%. I think in Excel, you’re going to pass smaller strings and need to be more aggressive with the guessing. Here are some of the results.
Also note that I only pull the first two letters of the language. A few of the languages have more letters, but I was too lazy to do it right.

That 90 in the last example, is 9000% so it will never pass. I meant to pass it 90%, which would give the same result. The confidence level per the ResponseText was 75.5%.
Other sources
VBA Corner
URL Encoding a String
Class Reference for Detect API