Extracting Skills from Text: Semantics–Not Keywords–Is the ROI Differentiator (part 1 of 2)

Dr. Ayala Allon

Automatically extracting skills from text documents like CVs and job posts is what enables successful talent mapping between individuals and potential roles. With the right mix of precision and speed, AI deployment can:

  • Accurately connect talent with the right skills to your open roles
  • Achieve best-fit matches quickly, lowering cost and time to hire
  • Reduce bias 
  • Broaden the talent pool 

However, not all AI skills extraction methods are the same, and the difference matters.

There are two main methodologies at play in the HR Tech space. In this two-part series, we’ll explain the ins and outs of each.

 

Explicit Skills vs. Implied Skills

 

First, let’s look at what’s in a document such as a CV. A job history description contains titles, roles, tasks and skills. Parsing, the process of separating data into more easily processed components to produce a well-structured set of information, is what makes mapping possible; in this case, skills mapping.

However, CVs and job descriptions are written as “free text,” whereas a “structured” set of information would usually take the form of a table or graph. The challenge is finding a platform that can scan a CV and identify both the skills that are explicitly stated in the document and those that are never mentioned but are implied by the tasks the individual describes.

Here’s an example. A financial analyst lists the following tasks on their CV: 

 

  • Maintaining and improving dashboards and calculation files of multiple reports using advanced Excel
  • Assessing credit risks  

From the description of these tasks, one can deduce the individual is proficient at the following skills:

 

  • Creating a financial report
  • Analyzing financial risk

But without those specifically worded skills outlined in the CV, how can a machine learning platform infer such capabilities?
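To make the gap concrete, here is a minimal Python sketch of what a structured record for the financial analyst above might look like. The `ParsedCV` class and its field names are hypothetical, chosen only for illustration, and treating “Excel” as an explicit skill is our assumption. The final check simply confirms that none of the implied skills appears word for word in the task descriptions.

```python
from dataclasses import dataclass, field


@dataclass
class ParsedCV:
    """Hypothetical structured record for one CV (field names are illustrative)."""
    title: str
    tasks: list[str] = field(default_factory=list)
    explicit_skills: list[str] = field(default_factory=list)  # stated in the text
    implied_skills: list[str] = field(default_factory=list)   # deduced from the tasks


analyst = ParsedCV(
    title="Financial analyst",
    tasks=[
        "Maintaining and improving dashboards and calculation files "
        "of multiple reports using advanced Excel",
        "Assessing credit risks",
    ],
    explicit_skills=["Excel"],  # assumption: named directly in the first task
    implied_skills=["Creating a financial report", "Analyzing financial risk"],
)

# Check which implied skills are absent, word for word, from the task text.
task_text = " ".join(analyst.tasks).lower()
not_in_text = [s for s in analyst.implied_skills if s.lower() not in task_text]
```

Here `not_in_text` ends up holding both implied skills: a reader can deduce them, but a literal search of the CV text cannot find them.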

 

Keywords vs. Semantics: The Keywords Approach

 

Put simply, under the Keywords Approach to skills extraction, a skill is extracted only if it appears in the CV exactly as it is written on the keywords list. Conversely, any skill that is not on the keywords list will not be extracted.

Here’s an example. If the skill “economics” is on the keywords list, it will only be extracted if it appears exactly that way on the CV. If, instead, a CV includes a variation of that word, such as “economist,” the keyword-focused platform will not detect it as a skill.

The reverse problem also occurs: a word that is on the keywords list will be extracted as a skill wherever it appears in the CV, even when it is used in a context that has nothing to do with skills.

The keywords approach therefore has two deficiencies. It can miss skills that aren’t explicitly mentioned in the text or that aren’t included in the keywords list, and it can extract words as skills even though they weren’t used in that sense. In other words, the keywords approach is context-free.
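Both failure modes are easy to reproduce. The Python sketch below is a simplified illustration, not any vendor’s implementation: the keyword list, the `extract_keywords` function and the two sample sentences are all hypothetical, and we assume case-insensitive, whole-word matching.

```python
import re

# Hypothetical keyword list; a real one would hold many more entries.
KEYWORDS = {"economics", "excel"}


def extract_keywords(text: str, keywords: set[str]) -> set[str]:
    """Return every keyword that appears in the text as a whole word."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return keywords & tokens


# A variation of the keyword ("economist") is not on the list.
cv_variation = "Worked as an economist at a regional bank."

# The keyword appears, but not as a skill of the candidate.
cv_other_context = "Shared a flat with two economics students during university."

missed = extract_keywords(cv_variation, KEYWORDS)
false_positive = extract_keywords(cv_other_context, KEYWORDS)
```

In the first sentence nothing is extracted, because “economist” is not on the list. In the second, “economics” is extracted even though the sentence says nothing about the candidate’s own skills. The matcher only looks at the words themselves and never at what surrounds them.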

 

In our next blog post, we’ll explain the Semantics Approach to AI-enabled skills extraction. 

 

You can also check out our Skills Extraction whitepaper here.