Ben Franklin | 17 Feb 2021

Azure Cognitive Search is a powerful tool for ensuring data is more readily organised, searchable and retrievable. But it is also giving end users new ways to find and interact with content. In Part 1 of this series, we’ll look at how Cognitive Search retrieves data from different types of content, making it accessible to search, saving time and project resources.

Ingest, Enrich, Explore

Azure articulates the principles driving Cognitive Search as ‘Ingest, Enrich, Explore’, but what those words mean depends on the business at hand. This isn’t coyness. Because these three words are malleable and dynamic rather than rigid, Azure Cognitive Search can help a business grow an intelligent digital ecosystem, one that can:

  • Remove the need for manual asset tagging and tidying
  • Retrieve quality search results from imperfect queries
  • Identify complex relationships between content 
  • Leverage Microsoft’s Artificial Intelligence (AI) power to aid machine learning

The first three qualities continually evolve, thanks to the last one, which means Cognitive Search improves as it is given more information. The more there is to search, the better the results can be.

The principles have basic definitions that can be universally applied to any business:

  • Ingest—to take in the organisation’s data that will inform the search
  • Enrich—to apply a choice of cognitive ‘skills’ to the data to aid search performance
  • Explore—to take the results of ‘Ingest’ and ‘Enrich’ to view the content and the relationships between the content in new ways
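To make the three stages concrete, here is a minimal sketch in plain Python. It does not call Azure at all: the sample documents, the function names and the keyword-extraction 'skill' are all assumptions made for illustration, standing in for the far richer skills the real service applies.

```python
from collections import defaultdict

# Hypothetical sample content; in practice this would come from files or a database.
DOCUMENTS = {
    "photo-1": "A smiling person holds a product in front of the Eiffel Tower",
    "brochure-1": "Tourist brochure for Paris and other French cities",
}

def ingest(raw):
    """Ingest: take in the raw content, keyed by document id."""
    return [{"id": doc_id, "content": text} for doc_id, text in raw.items()]

def enrich(doc):
    """Enrich: a stand-in 'skill' that extracts lowercase keywords."""
    words = {w.strip(".,").lower() for w in doc["content"].split()}
    return {**doc, "keywords": sorted(words)}

def build_index(docs):
    """Map each keyword to the ids of the documents that contain it."""
    index = defaultdict(set)
    for doc in docs:
        for keyword in doc["keywords"]:
            index[keyword].add(doc["id"])
    return index

def explore(index, term):
    """Explore: look up which documents mention a term."""
    return sorted(index.get(term.lower(), set()))

index = build_index([enrich(d) for d in ingest(DOCUMENTS)])
print(explore(index, "Paris"))   # ['brochure-1']
print(explore(index, "Eiffel"))  # ['photo-1']
```

The point of the sketch is the shape of the flow: content goes in once, enrichment adds searchable metadata, and exploration only ever touches the enriched index.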

Extract information from many file types

‘Ingest’ and ‘Enrich’ show the difference between Cognitive Search and an ordinary search function straight away. When you upload your data to Cognitive Search, it extracts information from the content itself rather than relying on text classifiers such as tags or descriptions. It can generate usable data from a wide variety of file types, including (but not limited to):

  • DOC
  • PDF
  • JPG
  • XLS
  • XML
  • HTML
  • JSON

When that data is ingested, it is enriched by the application of cognitive skills, a palette of tools that assess the content and index it. Azure comes with many of these skills out of the box, some of them very good, and there is scope to add custom skills that are industry or project specific.

Importantly, Cognitive Search can work with both structured and unstructured data. Previously, getting an accurate search result meant ensuring that every classifier was in the correct field in the database before running a query.

Cognitive Search, however, is capable of figuring out what information belongs to which field for each asset. Take a photograph of a smiling person holding a best-selling product in front of the Eiffel Tower. Once landmark, location and mood recognition skills have been run on it, it would surface in searches for Paris, happy, Eiffel Tower, France, city, the product brand name or perhaps some interesting combination of similar terms.

Speak the user’s language

Crucially, it will also surface this photo if the query runs something like ‘Iffel Twoer’, a hasty or common misspelling of one of the classifiers that Cognitive Search recognises. It may also come up on a query such as ‘I would like to see a picture of Paris.’ In this case, it’s a more conversational query using natural language, and Cognitive Search is able to filter out the less important words.
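In the service itself, this kind of tolerance is requested through fuzzy search in the full Lucene query syntax, where a term is followed by `~` (for example `Iffel~ Twoer~`). The sketch below is not Azure's implementation. It is a small standalone illustration of the underlying idea, edit distance with adjacent-letter swaps, and the vocabulary list and function names are assumptions made for the example.

```python
def edit_distance(a: str, b: str) -> int:
    """Count the inserts, deletes, substitutions and adjacent swaps needed to turn a into b."""
    a, b = a.lower(), b.lower()
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Two adjacent letters typed in the wrong order count as one edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]

def fuzzy_match(query_term, vocabulary, max_edits=2):
    """Return the indexed terms within max_edits of the query term."""
    return [t for t in vocabulary if edit_distance(query_term, t) <= max_edits]

# Hypothetical terms that enrichment might have attached to the photo.
VOCABULARY = ["Eiffel", "Tower", "Paris", "France", "city"]

print(fuzzy_match("Iffel", VOCABULARY))  # ['Eiffel']
print(fuzzy_match("Twoer", VOCABULARY))  # ['Tower']
```

Both misspellings sit one edit away from the intended word: ‘Iffel’ is missing a letter and ‘Twoer’ has two letters swapped.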

If you’ve ever run a search on a website using natural language and gotten a lot of results that highlighted the word ‘the’ but not the relevant word, you understand why this filtering ability is important.
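A rough way to picture this filtering is stop-word removal, sketched below in plain Python. The stop-word list here is a deliberately tiny assumption made for the example, and the real service's language analysis is considerably more involved than a lookup in a set.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"i", "would", "like", "to", "see", "a", "of", "the"}

def significant_terms(query):
    """Lowercase the query, split it into words and drop the stop words."""
    tokens = re.findall(r"[a-z0-9']+", query.lower())
    return [t for t in tokens if t not in STOP_WORDS]

print(significant_terms("I would like to see a picture of Paris."))
# ['picture', 'paris']
```

Only the words that carry meaning are left to be matched against the index, so ‘the’ never gets the chance to dominate the results.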

And it’s also quite sophisticated. Azure’s language recognition isn’t limited to the language in which the query is written. It can also retrieve searchable content that contains the search keyword in other languages. So if you search for ‘French cities’, you’ll get your photo of the Eiffel Tower as well as perhaps a tourist brochure PDF showcasing les villes françaises. Cognitive Search uses optical character recognition to pull text from images and PDFs, and it uses language recognition and translation to determine that this brochure is a suitable result for ‘French cities’.

Put power into play

There are far too many individual features to describe in detail in one article. And in any case, it’s nice that you can find a photo of the Eiffel Tower without having to tag it, but what does it mean for business?  

It’s different for everyone. And its power is that it can be made to work for any type of business. Microsoft has an interesting knowledge mining case study from engineering firm Howden. They’ve streamlined their bidding process by making all their relevant assets searchable in new ways. Consider a much simpler example.

Imagine a catering wholesaler with a massive online catalogue. Many customers search for ‘pie’. Most pie seekers end up actually purchasing apple pie. Maybe categorising other apple pastries under ‘pie’ might get clients who buy apple pie to look at other things that are very like apple pie. The wholesaler might shift more product that way.

Tagging every apple product with #pie is a lot of work, though. They have dozens of relevant products, and new ones all the time. Putting a rule in place to tag anything containing the word ‘apple’ as #pie accidentally includes a Bramley apple sausage roll, an apple corer and other irrelevant items. What to do?

Imagine if the catering wholesaler didn’t need to do this work at all. Instead, they could give the search engine some parameters for what qualifies something as a pie, and then every time they upload a product, it could figure it out on its own. The cognitive skills would look at the text of the description, the photograph, even a handwritten recipe, as well as how other pies have previously been categorised, and then decide. Apple danish? Yes, it’s #pie. Apple pattern apron? No.
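One way to picture this is as a custom skill. The sketch below is a heavily simplified, rule-based stand-in: it only looks at the product description, whereas the scenario above also draws on photographs and past categorisation. The parameter names, term lists and function names are all assumptions made for the example. The request and response envelope (a `values` array of records, each with a `recordId` and a `data` object) follows the custom Web API skill contract as we understand it, so check it against the current Azure documentation before relying on it.

```python
# Hypothetical parameters for what counts as a pie-like product.
PIE_PARAMETERS = {
    "fillings": {"apple"},
    "pastry_terms": {"pie", "danish", "tart", "turnover", "strudel"},
    "excluded_terms": {"sausage", "corer", "apron"},
}

def is_pie_like(description, params=PIE_PARAMETERS):
    """Require a filling and a pastry term, and reject anything explicitly excluded."""
    words = set(description.lower().replace(",", " ").split())
    has_filling = bool(words & params["fillings"])
    has_pastry = bool(words & params["pastry_terms"])
    is_excluded = bool(words & params["excluded_terms"])
    return has_filling and has_pastry and not is_excluded

def run_skill(request):
    """Tag each incoming record and echo its recordId back in the response."""
    results = []
    for record in request["values"]:
        description = record["data"].get("description", "")
        tags = ["pie"] if is_pie_like(description) else []
        results.append({
            "recordId": record["recordId"],
            "data": {"tags": tags},
            "errors": None,
            "warnings": None,
        })
    return {"values": results}

# Hypothetical batch of newly uploaded products.
request = {
    "values": [
        {"recordId": "1", "data": {"description": "Apple danish"}},
        {"recordId": "2", "data": {"description": "Apple pattern apron"}},
        {"recordId": "3", "data": {"description": "Bramley apple sausage roll"}},
    ]
}

for record in run_skill(request)["values"]:
    print(record["recordId"], record["data"]["tags"])
# 1 ['pie']
# 2 []
# 3 []
```

Unlike the blanket ‘apple’ rule, the parameters here can be adjusted in one place as new product types appear, and the tagging runs automatically on every upload.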

This example is scaled down for simplicity, but imagine an organisation with many different types of assets across different locations. Or consider a business that has to bring together many resources, including people from different teams with different qualifications, to work on a project in one place at one time. Availability, project eligibility, inventory, costs and more could all be made visible in one search.

If Azure Cognitive Search sounds like it might be right for your business, let’s talk more about it. Contact our Technical Team.