Extracting Skills from Text: Semantics Is the ROI Differentiator (1)

Automatically extracting skills from text documents like CVs and job posts is what enables successful talent mapping between individuals and potential roles. With the right mix of precision and speed, AI deployment can:

Accurately connect talent with the right skills to your open roles
Achieve best-fit matches quickly, lowering cost and time to hire
Reduce bias
Broaden the talent pool

However, not all AI skills extraction methods are the same, and the difference matters.

There are two main methodologies at play in the HR Tech space. Across two blog posts, starting with this one, we’ll explain the ins and outs of each.

Explicit Skills vs. Implied Skills

First, let’s look at what’s in a document such as a CV. Throughout a person’s job history, there are titles, roles, tasks and skills. Parsing, the process of separating data into more easily processed components to produce a well-structured set of information, is what makes mapping possible; in this case, skills mapping.

However, CVs and job descriptions are written as free text, whereas structured information would usually take the form of a table or graph. The challenge becomes finding a platform that can scan a CV and identify both the skills explicitly stated in the document and those that are not stated but are implied by the tasks the individual describes.

Here’s an example. A financial analyst lists the following tasks on their CV:

Maintaining and improving dashboards and calculation files of multiple reports using advanced Excel
Assessing credit risks

From the description of these tasks, one can deduce the individual is proficient at the following skills:

Creating a financial report
Analyzing financial risk

But without those specifically worded skills outlined in the CV, how can a machine learning platform infer such capabilities?

Keywords vs. Semantics: The Keywords Approach

Put simply, using the Keywords Approach to skills extraction means words on the keywords list must appear in the exact same way within the CV in order to be extracted. Conversely, any skills not listed on the keywords list will not be extracted.

Here’s an example. If the skill “economics” is on the keywords list, it will only be extracted if it appears exactly that way on the CV. If, instead, a CV includes a variation of that word, such as “economist,” the keyword-focused platform will not detect it as a skill.
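
To make the exact-match behavior concrete, here is a minimal sketch in Python. It is an illustration only, not any vendor's actual implementation: the function name extract_keyword_skills, the sample keywords list and the two CV sentences are all hypothetical, and the sketch handles single-word keywords only.

```python
import re

def extract_keyword_skills(text, keywords):
    """Return the keywords that appear in the text as exact whole words.

    Hypothetical helper for illustration: matching is case-insensitive,
    but no stemming or synonym handling is applied.
    """
    # Split the text into lowercase alphabetic tokens.
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    # A keyword counts only if one token matches it exactly.
    return sorted(k for k in keywords if k.lower() in tokens)

# Hypothetical keywords list and CV sentences.
keywords = ["economics", "excel"]
cv_a = "Studied economics and built reports in Excel."
cv_b = "Worked as an economist at a regional bank."

print(extract_keyword_skills(cv_a, keywords))  # both keywords are found
print(extract_keyword_skills(cv_b, keywords))  # nothing is found
```

The second CV clearly describes someone with an economics background, yet the extractor returns nothing, because “economist” is not the exact string on the list.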

Taken further, a word that is on the keywords list but appears in the CV in a non-skill context will still be extracted as a skill.
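
A short Python sketch can illustrate this false-positive problem. The keyword “excel,” the two sentences and the function name keyword_hits are assumptions made up for this illustration; the point is only that a pure string match cannot tell a spreadsheet skill from an ordinary verb.

```python
import re

def keyword_hits(text, keywords):
    """Return every token in the text that exactly matches a keyword.

    Hypothetical helper for illustration: it looks at words only,
    never at the sentence around them.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t in keywords]

# Hypothetical keywords list with a single entry.
keywords = {"excel"}

skill_sentence = "Maintained dashboards using advanced Excel."
other_sentence = "I strive to excel in every role I take on."

# Both sentences yield the same hit, although only the first one
# refers to the spreadsheet skill.
print(keyword_hits(skill_sentence, keywords))
print(keyword_hits(other_sentence, keywords))
```

Because the matcher never inspects the surrounding words, it reports the same result for both sentences.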

Therefore, the keywords approach has two deficiencies. It can miss skills that aren’t explicitly mentioned in the text or aren’t included in the keywords list, and it can extract words as skills even when they were not used in that sense. In other words, the keywords approach is context-free.

In our next blog post, we’ll explain the Semantics Approach to AI-enabled skills extraction.

You can also check out our Skills Extraction whitepaper here.