Extracting Skills from Text: Semantics–Not Keywords–Is the ROI Differentiator (part 2 of 2)

In our previous post, we talked about the difference between explicit skills and implied skills, and explained the Keyword Approach to skills extraction from text documents like CVs and job posts. We also outlined the importance of this automation. The right mix of precision and speed in AI deployment can:

  • Accurately connect talent with the right skills to your open roles
  • Achieve best-fit matches quickly, lowering cost and time to hire
  • Reduce bias
  • Broaden the talent pool

Let’s now look at the second methodology.

 

Keywords vs. Semantics: The Semantics Approach

 

Semantic Analysis of text is the ability to construct a logical representation of the meaning of the text as a whole, the same way we as humans understand natural language. A key factor in constructing the meaning of words is the ability to understand them based on context. For example, the word “bank” can have a different meaning depending on the context in which it appears. If a friend picks up their paycheck and says they’re going to the bank, you know they’re headed to a financial institution, not a snowbank.

As humans, we understand words based on their context automatically: neighboring words are the most important influencers on context, but distant words can also have an effect. Natural Language Processing (NLP) is the branch of AI that aims to give computers the same ability.

As such, if we want computers to understand words the way humans do, they must be able to interpret each word based on the context in which it appears.

NLP can be used to build machines that understand and respond to text or voice data in much the same way humans do. To extract skills from free-text documents like CVs, the retrain.ai Talent Intelligence Platform uses Deep Learning NLP models called transformers, a type of language model that processes each word in a sentence in relation to all the other words in the sentence, rather than processing each word individually. 
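To make "each word in relation to all the other words" more concrete, here is a minimal sketch of the self-attention idea behind transformers. It is a toy illustration only, not the retrain.ai model: the two-dimensional word vectors are made up, and the learned query/key/value projections and multiple attention heads of a real transformer are left out. The names `EMBEDDINGS`, `self_attention` and `contextualize` are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Simplified self-attention: every output vector is a weighted mix
    of ALL input vectors, weighted by similarity to the current word."""
    dim = len(vectors[0])
    outputs = []
    for query in vectors:
        # Scaled dot-product similarity between this word and every word.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        mixed = [
            sum(w * vec[i] for w, vec in zip(weights, vectors))
            for i in range(dim)
        ]
        outputs.append(mixed)
    return outputs

# Made-up vectors: axis 0 stands for "finance", axis 1 for "weather".
EMBEDDINGS = {
    "paycheck": [1.0, 0.0],
    "snow": [0.0, 1.0],
    "bank": [0.5, 0.5],  # ambiguous on its own
}

def contextualize(words):
    """Return a context-aware vector for each (distinct) word."""
    vectors = [EMBEDDINGS[word] for word in words]
    return dict(zip(words, self_attention(vectors)))

bank_near_paycheck = contextualize(["paycheck", "bank"])["bank"]
bank_near_snow = contextualize(["snow", "bank"])["bank"]
print(bank_near_paycheck, bank_near_snow)
```

In this sketch the same word, "bank," ends up with a vector that leans toward "finance" next to "paycheck" and toward "weather" next to "snow," which is the context effect described above.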

Unlike the Keyword Approach, in which text must appear in the exact same way within the CV in order to be extracted, Semantic Analysis automatically extracts skills both when they are explicitly written in the text and when they are implied by the tasks the individual describes in their CV.

For instance, using the financial analyst example from our previous post, the Semantics Approach will recognize the keyword “economics,” and it will also interpret a sentence like “Maintaining and improving dashboards and calculation files of multiple reports using advanced Excel” to extract “create financial reports” as a skill.

Conversely, Semantic Analysis will not extract a word that could be considered a skill if it doesn’t appear in the right context. For example, if an individual describes working as an “Office Manager in the Economics Department,” Deep Learning NLP models will not infer that the person has the skill “economics” just because the word is there.

 

Unique Needs of HR

 

Workforce roles and skills are evolving at breakneck speed, with new capabilities in demand and more open jobs than there are people to fill them. To optimize talent acquisition, management and retention, HR leaders need automation that speaks their ever-evolving language. 

Our retrain.ai Talent Intelligence Platform uses semantics-based machine learning models to provide the most accurate, actionable data possible. We empower enterprises to succeed through skills-based hiring, talent mapping capabilities, internal mobility and retention initiatives, and personalized learning and development programs for every single employee.

If you’d like to see how our sophisticated, Responsible AI-driven Talent Intelligence platform transforms workforce planning, we’d love to show you. Book a Demo to participate in a tailored walk-through based on your organization’s specific needs for hiring, upskilling and retaining quality talent.

Extracting Skills from Text: Semantics–Not Keywords–Is the ROI Differentiator (part 1 of 2)

Automatically extracting skills from text documents like CVs and job posts is what enables successful talent mapping between individuals and potential roles. With the right mix of precision and speed, AI deployment can:

  • Accurately connect talent with the right skills to your open roles
  • Achieve best-fit matches quickly, lowering cost and time to hire
  • Reduce bias 
  • Broaden the talent pool 

However, not all AI skills extraction methods are the same, and the difference matters.

There are two main methodologies at play in the HR Tech space. In our next two blog posts, we’ll explain the ins and outs of each.

 

Explicit Skills vs. Implied Skills

 

First, let’s look at what’s in a document such as a CV. Throughout one’s job history description, there are titles, roles, tasks and skills. Parsing–the means by which data is separated into more easily processed components in order to produce a well-structured set of information–is what enables mapping ability; in this case, skills mapping.

However, CVs and job descriptions are written in a “free-text” manner, whereas a “structured” set of information would usually be in the form of a table or graph. The challenge becomes finding a platform that can scan a CV and identify both the skills that are explicitly indicated in the document and those that are not explicitly mentioned but are implied by the tasks the individual describes.

Here’s an example. A financial analyst lists the following tasks on their CV: 

 

  • Maintaining and improving dashboards and calculation files of multiple reports using advanced Excel
  • Assessing credit risks  

From the description of these tasks, one can deduce the individual is proficient at the following skills:

 

  • Creating a financial report
  • Analyzing financial risk

But without those specifically-worded skills outlined in the CV, how can a machine learning platform infer such capabilities?
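Before turning to the two approaches, it may help to picture the structured output a skills extractor is aiming for. The sketch below writes the financial analyst example as a simple record. The `SkillsRecord` class and its field names are hypothetical, chosen only for illustration, and pairing “Assessing credit risks” with “Analyzing financial risk” is an assumption based on the order of the two lists above.

```python
from dataclasses import dataclass, field

@dataclass
class SkillsRecord:
    """Hypothetical structured output for a single CV."""
    # Skills written out word for word in the CV.
    explicit_skills: list = field(default_factory=list)
    # Skills inferred from tasks: skill -> task it was inferred from.
    implied_skills: dict = field(default_factory=dict)

record = SkillsRecord(
    implied_skills={
        "Creating a financial report": (
            "Maintaining and improving dashboards and calculation files "
            "of multiple reports using advanced Excel"
        ),
        "Analyzing financial risk": "Assessing credit risks",
    },
)

for skill, task in record.implied_skills.items():
    print(f"{skill}  <-  {task}")
```

Neither skill appears word for word in the CV, so both sit under `implied_skills`; producing that mapping automatically is the hard part.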

 

Keywords vs. Semantics: The Keywords Approach

 

Put simply, using the Keywords Approach to skills extraction means words on the keywords list must appear in the exact same way within the CV in order to be extracted. Conversely, any skills not listed on the keywords list will not be extracted. 

Here’s an example. If the skill “economics” is on the keywords list, it will only be extracted if it appears exactly that way on the CV. If, instead, a CV includes a variation of that word, such as “economist,” the keyword-focused platform will not detect it as a skill.

Taken further, words that are on the keywords list but appear in the CV in a context other than skills will still be extracted as skills.

Therefore, the deficiency of the Keywords Approach is that it can miss skills that aren’t explicitly mentioned in the text or that aren’t included in the keywords list. It can also extract words as skills even though they were not mentioned in that context. In other words, the Keywords Approach is context-free.
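Both failure modes are easy to reproduce. The following is a minimal sketch of exact-match keyword extraction, assuming a one-word keywords list; the `SKILL_KEYWORDS` set and the `extract_skills_by_keyword` function are hypothetical names, not taken from any real product, and multi-word keywords are not handled.

```python
import re

# Hypothetical keywords list; a real one would be far longer.
SKILL_KEYWORDS = {"economics"}

def extract_skills_by_keyword(text, keywords=SKILL_KEYWORDS):
    """Return the keywords that appear in the text as exact whole words.

    Matching is case-insensitive but otherwise literal: no variations,
    no synonyms, and no awareness of the surrounding context.
    """
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(words & set(keywords))

# Missed skill: "economist" is not an exact match for "economics".
missed = extract_skills_by_keyword("Worked as an economist for five years")

# False positive: the word names a department here, not a skill.
false_positive = extract_skills_by_keyword(
    "Office Manager in the Economics Department"
)

print(missed, false_positive)
```

The first call returns an empty list and the second returns the keyword, even though a human reader would judge both cases the other way around.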

 

In our next blog post, we’ll explain the Semantics Approach to AI-enabled skills extraction. 

 

You can also check out our Skills Extraction whitepaper here.