In this tutorial we are going to see how to use the Text Analytics API of Cognitive Services to extract language, key phrases, sentiment, and entities from your text. You can call the Text Analytics APIs directly, but using the Microsoft.Azure.CognitiveServices.Language SDK is easier.
Prerequisites
To run the sample code you must have an edition of Visual Studio installed.
You will need an Azure Cognitive Services key. Follow this tutorial to get one. If you don’t have an Azure account, you can use the free trial to get a subscription key.
Create the Project
To create an application, follow the steps below:
Create a .NET Core Console Application in Visual Studio 2017.
Add the Microsoft.Azure.CognitiveServices.Language SDK NuGet package by using the NuGet Package Manager Console. If you choose to install it via the GUI, make sure you check the Prerelease checkbox.
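Add your subscription key and a credentials class that attaches the key to every request:
// Using directives for the top of Program.cs
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Rest;
// Inside the Program class:
private const string SubscriptionKey = ""; // Insert your Text Analytics subscription key
// Attaches the subscription key header to every outgoing request.
class ApiKeyServiceClientCredentials : ServiceClientCredentials
{
    public override Task ProcessHttpRequestAsync(HttpRequestMessage request, CancellationToken cancellationToken)
    {
        request.Headers.Add("Ocp-Apim-Subscription-Key", SubscriptionKey);
        return base.ProcessHttpRequestAsync(request, cancellationToken);
    }
}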
Then start building your client. Add the following code to the Main function to create the client. Replace the region in Endpoint with the one you signed up for. You can find the endpoint on your resource's page in the Azure portal. The endpoint typically looks like “https://[region].api.cognitive.microsoft.com”; include only the protocol and hostname.
ITextAnalyticsClient client = new TextAnalyticsClient(new ApiKeyServiceClientCredentials())
{
    Endpoint = "https://westeurope.api.cognitive.microsoft.com/"
}; // Replace the endpoint with the correct region for your Text Analytics subscription
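If you copied a longer URL from the portal, the "protocol and hostname only" rule can be applied in code. The following is a minimal sketch, not part of the SDK: EndpointHelper and ToBaseEndpoint are hypothetical names I made up for illustration, and the only library call used is the standard Uri.GetLeftPart method.

```csharp
using System;

// Hypothetical helper (not part of the Text Analytics SDK).
public static class EndpointHelper
{
    // Returns only the scheme and host (plus a non-default port, if any)
    // of an absolute URL, dropping any path or query string.
    public static string ToBaseEndpoint(string url)
    {
        var uri = new Uri(url, UriKind.Absolute);
        return uri.GetLeftPart(UriPartial.Authority);
    }
}
```

For example, passing "https://westeurope.api.cognitive.microsoft.com/text/analytics/v2.0" should give back "https://westeurope.api.cognitive.microsoft.com", which you can then assign to Endpoint.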
Detect Language
Continue in Main and add the following code for language detection. You can use BatchInput to submit multiple documents in one call, and you can iterate through the results using the result.Documents collection.
// .Result blocks until the asynchronous call completes.
var result = client.DetectLanguageAsync(new BatchInput(
    new List<Input>()
    {
        new Input("1", "This is a document written in English."),
        new Input("2", "Este es un documento escrito en Español."),
        new Input("3", "这是一个用中文写的文件")
    })).Result;
// Printing language results.
foreach (var document in result.Documents)
{
    Console.WriteLine($"Document ID: {document.Id} , Language: {document.DetectedLanguages[0].Name}");
}
There are some limits you should be aware of for every request. All of the Text Analytics API endpoints accept raw text data. The current limit is 5,120 characters per document; if you need to analyze larger documents, you can break them up into smaller chunks. The rate limit is 100 calls per minute, but you can submit a large quantity of documents in a single call (up to 1,000 documents).
Limit                                      Value
Maximum size of a single document          5,120 characters
Maximum size of entire request             1 MB
Maximum number of documents in a request   1,000 documents
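To stay within these limits, you can split long text and group documents before calling the API. Below is a minimal sketch under a few assumptions: RequestLimits, SplitText and ToBatches are hypothetical names of my own, the character limit is assumed to match what string.Length counts, the split is naive (it may cut through a word), and the 1 MB request size is not checked.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of the Text Analytics SDK).
public static class RequestLimits
{
    public const int MaxCharsPerDocument = 5120;     // per-document limit from the table above
    public const int MaxDocumentsPerRequest = 1000;  // per-request limit from the table above

    // Splits text into consecutive pieces of at most maxChars characters.
    // Assumption: a naive split by string.Length that may cut through a word.
    public static List<string> SplitText(string text, int maxChars = MaxCharsPerDocument)
    {
        if (maxChars <= 0) throw new ArgumentOutOfRangeException(nameof(maxChars));
        var chunks = new List<string>();
        for (int start = 0; start < text.Length; start += maxChars)
        {
            int length = Math.Min(maxChars, text.Length - start);
            chunks.Add(text.Substring(start, length));
        }
        return chunks;
    }

    // Groups documents into batches of at most maxDocs items each.
    public static List<List<string>> ToBatches(IReadOnlyList<string> documents, int maxDocs = MaxDocumentsPerRequest)
    {
        if (maxDocs <= 0) throw new ArgumentOutOfRangeException(nameof(maxDocs));
        var batches = new List<List<string>>();
        for (int i = 0; i < documents.Count; i += maxDocs)
        {
            var batch = new List<string>();
            int end = Math.Min(i + maxDocs, documents.Count);
            for (int j = i; j < end; j++)
            {
                batch.Add(documents[j]);
            }
            batches.Add(batch);
        }
        return batches;
    }
}
```

Each batch can then be turned into one request, giving every chunk its own document ID so you can match the results back to the original text.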
Detect Key-phrases
To detect key phrases, add the following code. You can iterate through the results using the result2.Documents collection.
KeyPhraseBatchResult result2 = client.KeyPhrasesAsync(new MultiLanguageBatchInput(
    new List<MultiLanguageInput>()
    {
        new MultiLanguageInput("ja", "1", "猫は幸せ"),
        new MultiLanguageInput("de", "2", "Fahrt nach Stuttgart und dann zum Hotel zu Fuß."),
        new MultiLanguageInput("en", "3", "My cat is stiff as a rock."),
        new MultiLanguageInput("es", "4", "A mi me encanta el fútbol!")
    })).Result;
// Printing keyphrases
foreach (var document in result2.Documents)
{
    Console.WriteLine($"Document ID: {document.Id} ");
    Console.WriteLine("\t Key phrases:");
    foreach (string keyphrase in document.KeyPhrases)
    {
        Console.WriteLine($"\t\t{keyphrase}");
    }
}
Extract Sentiment
To detect sentiment, add the following code. You can iterate through the results using the result3.Documents collection. The score indicates the sentiment: it is a value from 0 to 1, and the higher it is, the more positive the text.
SentimentBatchResult result3 = client.SentimentAsync(
    new MultiLanguageBatchInput(
        new List<MultiLanguageInput>()
        {
            new MultiLanguageInput("en", "0", "I had the best day of my life."),
            new MultiLanguageInput("en", "1", "This was a waste of my time. The speaker put me to sleep."),
            new MultiLanguageInput("es", "2", "No tengo dinero ni nada que dar..."),
            new MultiLanguageInput("it", "3", "L'hotel veneziano era meraviglioso. È un bellissimo pezzo di architettura."),
        })).Result;
// Printing sentiment results
foreach (var document in result3.Documents)
{
    Console.WriteLine($"Document ID: {document.Id} , Sentiment Score: {document.Score:0.00}");
}
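A raw score is often easier to read once it is turned into a label. The sketch below shows one way to do that. SentimentLabels and Describe are hypothetical names, and the 0.4 and 0.6 cut-offs are my own illustrative assumptions, not thresholds defined by the service, so adjust them to your data.

```csharp
using System;

// Hypothetical helper (not part of the Text Analytics SDK).
public static class SentimentLabels
{
    // Maps a score in the range 0 to 1 to a coarse label.
    // Assumption: the default thresholds are illustrative only.
    public static string Describe(double score, double negativeBelow = 0.4, double positiveAbove = 0.6)
    {
        if (score < 0.0 || score > 1.0)
        {
            throw new ArgumentOutOfRangeException(nameof(score), "Score must be between 0 and 1.");
        }
        if (score < negativeBelow) return "negative";
        if (score > positiveAbove) return "positive";
        return "neutral";
    }
}
```

Inside the loop above you could print SentimentLabels.Describe(...) next to the score. If Score is a nullable value in your SDK version, check that it has a value before passing it in.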
Identify Entities
To find entities, add the following code. You can iterate through the results using the result4.Documents collection.
EntitiesBatchResultV2dot1 result4 = client.EntitiesAsync(
    new MultiLanguageBatchInput(
        new List<MultiLanguageInput>()
        {
            new MultiLanguageInput("en", "0", "The Great Depression began in 1929. By 1933, the GDP in America fell by 25%.")
        })).Result;
// Printing entities results
foreach (var document in result4.Documents)
{
    Console.WriteLine($"Document ID: {document.Id} ");
    Console.WriteLine("\t Entities:");
    foreach (EntityRecordV2dot1 entity in document.Entities)
    {
        Console.WriteLine($"\t\t{entity.Name}\t\t{entity.WikipediaUrl}\t\t{entity.Type}\t\t{entity.SubType}");
    }
}
You can find the complete source code on my GitHub in this repository.
Georgia Kalyva
Georgia Kalyva is a Microsoft AI MVP with years of experience in software engineering and is currently working for ITT as a Web Applications Developer. She is passionate about AI and Azure and has represented Greece in global competitions in Technology and Entrepreneurship. She is also a member of the Microsoft Student Partners team and has taken over the role of mentor in several Microsoft competitions and trainings.