Enrichments
This page is a placeholder for documentation on the enrichment sources used by the Fantastic.jobs API (LinkedIn, Crunchbase, Glassdoor, AI-derived fields).
More details coming soon.
In addition to all the raw job data, we provide several enrichments:
locations_derived
While the Job Posting Schema sets a solid standard for location data, some ATS platforms are more diligent than others in ensuring its accuracy. For this reason we’ve developed a system that normalizes all locations to a ‘city/county, region, country’ format.
We use Geoapify, which is based on OpenStreetMap: https://www.geoapify.com/
If you already normalize job location data with your own API, we recommend using locations (the source-supplied Job Posting schema objects) instead of locations_derived. The schema is described here: https://developers.google.com/search/docs/appearance/structured-data/job-posting
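As a minimal sketch of how the normalized format could be consumed, the snippet below splits a ‘city/county, region, country’ string into its parts. It assumes each locations_derived entry is a plain comma-separated string and that shorter values (for example a country on its own) should be read from the right; both are assumptions, and parse_derived_location is a hypothetical helper, not part of the API.

```python
from typing import Dict, Optional


def parse_derived_location(value: str) -> Dict[str, Optional[str]]:
    """Split a 'city/county, region, country' string into named parts.

    Assumption: the country is always the last element, so shorter
    values are aligned from the right and missing parts stay None.
    """
    parts = [p.strip() for p in value.split(",") if p.strip()]
    keys = ["city", "region", "country"]
    result: Dict[str, Optional[str]] = {key: None for key in keys}
    # Walk both lists from the end so the last part maps to "country".
    for key, part in zip(reversed(keys), reversed(parts[-3:])):
        result[key] = part
    return result


if __name__ == "__main__":
    print(parse_derived_location("Amsterdam, North Holland, Netherlands"))
```

Reading from the right keeps the country populated even when the city or region is missing from the value.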
include_ai
We extract useful insights from the job description with an LLM, including salary, benefits, experience level, detailed remote filters, and more. Please see the attached table for all fields. We currently cover 99.9% of all jobs. Please note that extraction uses a one-shot prompt with OpenAI’s GPT-4o mini model, and hallucinations can occur.
Please note: For LinkedIn jobs, we flag certain jobs from staffing agencies, job boards that cross-post on LinkedIn, and click farms. These jobs are not analysed with the LLM, because these companies occasionally post extreme job volumes. We’ve seen an instance where a single company posted 600,000 jobs in one day.
include_li
We enrich the ATS jobs with LinkedIn company profile data. We've developed a process that maps over 99% of all jobs to a profile with an accuracy of 99.4%. If you find any inaccuracies, please report them and we will correct them in our database.
domain_derived
Since ATS platforms don’t always include the domain or homepage of the company, we’ve developed a system to identify the company’s domain. This can be very helpful for further analysis and enrichment. The accuracy sits at 98%, with over 96% of jobs populated. During testing, we found greater accuracy for United States companies, especially medium to large ones. We found lower accuracy for non-US companies and for companies with generic names.
exclude_ats_duplicate
We have released a beta version of our LinkedIn/ATS deduplication, specifically for users who access both datasets. Every LinkedIn job is checked against the ATS dataset with two checks:
- A match of job title + organization name
- A match of job title + LinkedIn company profile mapping
If either check has a hit, the LinkedIn job is flagged as ats_duplicate=true in the API output. If neither has a hit, the LinkedIn job is flagged as ats_duplicate=false.
Some jobs are not checked: jobs that originate from agencies/job boards (linkedin_org_recruitment_agency_derived=true) and jobs with LinkedIn EasyApply (direct_apply=true). These jobs are flagged as ats_duplicate=null.
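The flagging rules for ats_duplicate can be sketched roughly as follows. This is an illustration of the described logic, not the production implementation: the record keys title, organization and linkedin_profile, the two lookup sets, and the lowercase/whitespace normalization are all assumptions made for the example. Only linkedin_org_recruitment_agency_derived, direct_apply and ats_duplicate come from the API description.

```python
from typing import Optional, Set, Tuple


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace (an assumed form of 'exact' matching)."""
    return " ".join(text.lower().split())


def ats_duplicate_flag(
    linkedin_job: dict,
    ats_title_org: Set[Tuple[str, str]],
    ats_title_profile: Set[Tuple[str, str]],
) -> Optional[bool]:
    """Return True, False, or None (None means the job was not checked)."""
    # Agency/job board jobs and EasyApply jobs are skipped entirely.
    if linkedin_job.get("linkedin_org_recruitment_agency_derived") or linkedin_job.get("direct_apply"):
        return None

    title = normalize(linkedin_job["title"])
    organization = normalize(linkedin_job["organization"])
    profile = linkedin_job.get("linkedin_profile")  # hypothetical key

    # Check 1: job title + organization name.
    if (title, organization) in ats_title_org:
        return True
    # Check 2: job title + mapped LinkedIn company profile.
    if profile is not None and (title, profile) in ats_title_profile:
        return True
    return False
```

Returning None rather than False for skipped jobs keeps "not checked" distinguishable from "checked, no match", mirroring the three values in the API output.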
We aim to flag the majority of duplicates in the datasets, but we look for exact hits only. This means that a number of duplicates will still slip through unflagged (false negatives). To fully eliminate duplicates between the two datasets, we recommend adding a layer of fuzzy deduplication.
During our checks, we found that these missed duplicates stem mostly from jobs with the following characteristics:
- The job is indexed on LinkedIn before it is indexed on the ATS.
- The job is older than 6 months on the ATS platform, but new on LinkedIn. Jobs older than 6 months are expired and not part of the de-duplication checks. This could be seen as a positive as well, indicating that the employer wants to bring attention to an older job post.
- The job on LinkedIn was posted through a programmatic platform such as Appcast or Adzuna, with minor changes to the job title or organization name.
- The job has a modified title or organization name on LinkedIn compared with the ATS.
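A fuzzy deduplication layer on top of the exact checks could look something like the sketch below, which catches small changes in title or organization name. It uses Python's standard difflib; the record keys title and organization and the 0.8 threshold are assumptions chosen for illustration and would need tuning against real data.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity ratio between 0.0 and 1.0, ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def is_fuzzy_duplicate(linkedin_job: dict, ats_job: dict, threshold: float = 0.8) -> bool:
    """Treat two jobs as duplicates when both title and organization are close.

    The threshold is an assumed starting point, not a recommended value.
    """
    title_score = similarity(linkedin_job["title"], ats_job["title"])
    org_score = similarity(linkedin_job["organization"], ats_job["organization"])
    return title_score >= threshold and org_score >= threshold


if __name__ == "__main__":
    linkedin = {"title": "Senior Software Engineer (Remote)", "organization": "Acme Inc."}
    ats = {"title": "Senior Software Engineer", "organization": "Acme Inc"}
    print(is_fuzzy_duplicate(linkedin, ats))
```

One way to use this is to run it only on LinkedIn jobs returned with ats_duplicate=false, comparing each against ATS jobs from the same or a similar organization to keep the number of comparisons manageable.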