
The Embedding API generates state-of-the-art, HR-native, cross-lingual vectors for Profiles and Jobs to power similarity, semantic search, clustering, deduplication, and HR-grade machine learning. Pretrained on 1.2B hiring decisions, it captures hiring signals beyond keywords across job families, industries, and seniority levels, with built-in fairness controls and EU AI Act compliance.
{ "parsing": { "model": "hrflow-file-v2.1", "confidence": 0.92 }, "profile": { "name": "John Smith", "title": "Data Scientist", "skills": ["ML", "Python"] } }
Trusted by Customers, Partners & the AI Ecosystem

Get HR-native embedding vectors (128d or 1024d float arrays) for Profiles, Jobs, or Text to power semantic search, matching, clustering, deduplication, and HR-grade ML pipelines.
Results are strictly deterministic: same inputs → same vectors.
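Because identical inputs yield identical vectors, a client can cache embeddings under a hash of the input instead of requesting them again. The sketch below is illustrative only: `cache_key`, `EmbeddingCache`, and the `embed` callable are hypothetical names, and the callable stands in for whatever client code actually calls the Embedding API.

```python
import hashlib
import json
from typing import Callable

Vector = list[float]


def cache_key(obj: dict, algorithm: str, dimension: int) -> str:
    """Build a stable key from the input object and the encoder settings.

    Keys are sorted so that two dicts with the same content but a different
    key order hash to the same value.
    """
    canonical = json.dumps(
        {"algorithm": algorithm, "dimension": dimension, "input": obj},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class EmbeddingCache:
    """In-memory cache that relies on deterministic embeddings."""

    def __init__(self, embed: Callable[[dict, str, int], Vector]):
        # `embed` is a hypothetical callable that wraps the real API request.
        self._embed = embed
        self._store: dict[str, Vector] = {}
        self.misses = 0

    def get(self, obj: dict, algorithm: str, dimension: int) -> Vector:
        key = cache_key(obj, algorithm, dimension)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._embed(obj, algorithm, dimension)
        return self._store[key]
```

The algorithm and dimension are part of the key because the same Profile embedded at 128d and at 1024d produces different vectors.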
Profile: key, reference, or profile object
Algorithm: profile-encoder, cross-encoder, dual-encoder
Dimension: 128d or 1024d
Output Format: float array / base64
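When the base64 output format is used, the client has to turn the string back into floats. The page does not specify the binary layout, so the sketch below assumes little-endian 32-bit floats packed back to back; `decode_embedding` and `encode_embedding` are hypothetical helper names, and the layout should be checked against the official API reference before relying on it.

```python
import base64
import struct


def decode_embedding(payload: str, dimensions: int) -> list[float]:
    """Decode a base64 string into a list of floats.

    Assumption: the payload packs little-endian float32 values back to back.
    """
    raw = base64.b64decode(payload)
    expected = dimensions * 4  # 4 bytes per float32 value
    if len(raw) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(raw)}")
    return list(struct.unpack(f"<{dimensions}f", raw))


def encode_embedding(values: list[float]) -> str:
    """Inverse helper, used here only to build a sample payload."""
    packed = struct.pack(f"<{len(values)}f", *values)
    return base64.b64encode(packed).decode("ascii")


if __name__ == "__main__":
    sample = [0.5, -0.25, 0.125, 1.0]  # values exactly representable in float32
    print(decode_embedding(encode_embedding(sample), dimensions=4))
```

Checking the byte length against the requested dimension (128 or 1024) catches a mismatched dimension setting early, before the vector reaches an index.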
{
  "code": 200,
  "message": "Profile embedded in 18ms.",
  "data": {
    "type": "profile",
    "key": "abc123",
    "model": "hrflow-embed-v3",
    "dimensions": 1024,
    "embedding": [
      0.0234, -0.1823, 0.4521, 0.3017,
      -0.2198, 0.1456, 0.0891, -0.3342,
      0.2765, 0.0543, -0.1987, 0.4102,
      "... 1012 more values ..."
    ],
    "normalized": true
  }
}

Pick the right encoder and dimension for your needs: by use case, data object, precision, and cost.
| Algorithm key | Data | Use Case | Speed | Precision | Languages | Dimensions |
|---|---|---|---|---|---|---|
| Hyperion Job-encoder | Job | Vectorize Jobs using their full role requirements, context, and work environment. Best for: Job ↔ Job similarity, Job clustering, Job classification. | 3ms/request | A+ | 43 languages | 128d, 1024d |
| Hyperion Profile-encoder | Profile | Vectorize Profiles using their full career trajectory & goals, end-to-end work experience, education, and skills. Best for: Profile ↔ Profile similarity, Profile clustering, Profile classification. | 3ms/request | A+ | 43 languages | 128d, 1024d |
| Hyperion Dual-encoder | Profile, Job, Text | Vectorize Profiles and Jobs in a shared latent space using independent representation. Best for: same-type (Profile ↔ Profile or Job ↔ Job) matching, similarity, clustering, classification. | 3ms/request | A+ | 43 languages | 128d, 1024d |
| Hyperion Cross-encoder | Profile, Job, Text | Vectorize Profiles and Jobs in the same vector space using pairwise representation optimised for matching. Best for: cross-type (Profile ↔ Job) matching, similarity, clustering, classification. | 3ms/request | A+ | 43 languages | 128d, 1024d |
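The "Best for" notes in the table can be read as a small decision rule. The helper below is a sketch of that rule, not part of the API: `pick_encoder` and its `shared_space` flag are hypothetical, and the returned strings simply reuse the encoder names from this page rather than confirmed parameter values.

```python
KNOWN_TYPES = {"profile", "job", "text"}


def pick_encoder(left: str, right: str, shared_space: bool = False) -> str:
    """Suggest an encoder for comparing two object types.

    - Different types (e.g. Profile vs Job): cross-encoder.
    - Same type, but vectors must live in a space shared with other
      object types, or the input is free text: dual-encoder.
    - Otherwise: the type-specific profile-encoder or job-encoder.
    """
    a, b = left.lower(), right.lower()
    if a not in KNOWN_TYPES or b not in KNOWN_TYPES:
        raise ValueError(f"unknown object type: {left!r} / {right!r}")
    if a != b:
        return "cross-encoder"
    if shared_space or a == "text":
        return "dual-encoder"
    return f"{a}-encoder"


if __name__ == "__main__":
    print(pick_encoder("Profile", "Job"))
    print(pick_encoder("Job", "Job"))
    print(pick_encoder("Profile", "Profile", shared_space=True))
```

Keeping the choice in one function makes it easier to avoid mixing vectors from different encoders in the same index, since vectors from different encoders are not necessarily comparable.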
Trusted by fast-growing HR Tech and Global Enterprise
Major fairness and compliance risks: male anchor profiles returned mostly male results, location proxies created allocation and representation biases we couldn't justify under GDPR or the EU AI Act, undisclosed training data, and prompt-injection concerns inside resumes.
HrFlow.ai Embedding gave us deterministic similarity with built-in fairness controls and HR-safe ingestion.
OpenAI embeddings treated past and recent experience similarly, and missed certifications, seniority fit, and trajectory signals. We got "semantic twins", not "hiring twins".
HrFlow.ai Embedding delivered Profile-to-Profile similarity grounded in hiring outcomes. OpenAI focused on words, while HrFlow.ai focuses on successful profiles for the same job.
Google vectors mostly clustered job descriptions. We needed "next-apply similarity": jobs a candidate is likely to click and apply to after liking a role.
HrFlow.ai Jobs Matching uses Job encoders trained on application signals, so "similar jobs" became "similar intent", and our engagement and time-to-fill improved without extra tuning.
Cohere gave us cosine similarity, not outcome-based similarity. Two profiles could be equally close in vector space, yet only one consistently succeeds for the same role. We kept adding heuristics to compensate.
HrFlow.ai Embedding replaced that with an HR-native Profile encoder trained on real hiring and application signals, so similarity reflects outcomes, not just semantics.
Most embedding vendors treated resumes and job descriptions like flat blobs of text, ignoring career trajectory and structured HR context. Token-based pricing also became expensive at scale.
HrFlow.ai gave us hierarchical HR-native encoders, custom feature management for metadata and assessments, and cross-lingual robustness, so fit scoring became both more accurate and more stable across languages.
Mistral gave us a generic embedding layer, but not HR-native similarity quality. We kept cycling through Hugging Face encoders, managing migrations, and adding heuristics, yet results stayed inconsistent across roles and industries.
HrFlow.ai Embedding shipped the full HR-native stack: Profile/Job encoders, deterministic scoring, custom features, and reasoning—so we stopped building the missing layers ourselves.
Integrate 200+ tools with the flip of a switch.
HR-native ETL with 200+ connectors plus Webhooks to ingest, normalize, and sync jobs & profiles across your stack through reliable pipelines with unified schemas.
No-code automation platform with 8,000+ app integrations to move data between tools using triggers + actions.
Visual automation platform to extract/transform/route data across 3,000+ apps (plus HTTP modules for any API).
Microsoft Power Automate, workflow automation with 1,000+ API connectors (and support for custom connectors).
Enterprise iPaaS/automation platform with 1,200+ pre-built connectors for orchestrating integrations and data workflows at scale.
Salesforce's low-code workflow automation tool; extended via AppExchange with 7,000+ apps to add integrations and capabilities.
HrFlow.ai Embedding generates HR-native vectors for Profiles and Jobs with real-time latency (~3ms/request) and batch throughput (~24K embeddings/min). It powers cosine-similarity fit scores (Job→Job, Profile→Profile, Profile↔Job) and supports profile/job/dual/cross encoders for best-fit matching. Vectors are cross-lingual (43+ languages), 128- or 1024-dimensional, and generalize across roles, seniority, and industries. Encoders are pretrained on 1.2B hiring decisions and applications, with fairness-regularized training on representation-bias–calibrated datasets aligned with EU AI Act-style governance.
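A cosine-similarity fit score can be computed directly from two vectors. The sketch below uses toy 3-dimensional vectors and hypothetical names (`cosine_similarity`, `rank_jobs`, the `job-a` style keys); real vectors would be the 128d or 1024d arrays returned by the API, obtained with an encoder suited to Profile ↔ Job comparison.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_jobs(
    profile_vec: list[float], job_vecs: dict[str, list[float]]
) -> list[tuple[str, float]]:
    """Return (job key, score) pairs sorted from best to worst fit."""
    scored = [(key, cosine_similarity(profile_vec, vec)) for key, vec in job_vecs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    profile = [1.0, 0.0, 0.0]  # toy vector, not real API output
    jobs = {
        "job-a": [1.0, 0.0, 0.0],
        "job-b": [0.0, 1.0, 0.0],
        "job-c": [1.0, 1.0, 0.0],
    }
    for key, score in rank_jobs(profile, jobs):
        print(key, round(score, 4))
```

When vectors are already unit-length, as the `"normalized": true` field in the sample response suggests, the dot product alone gives the same score and the two norm computations can be skipped.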
Built for sensitive HR data—secure by default, enterprise-ready.
TLS in transit + encryption at rest to protect documents and extracted data.
Minimal storage by default, with configurable retention policies to match your compliance needs.
AI Act– and GDPR-ready processing, with documented controls for data handling and compliance.
Data processing and storage can be aligned with your required region (e.g., EU or US) depending on your deployment.
| Feature | | | | | | OSE |
|---|---|---|---|---|---|---|
| Deployment & Trust | | | | | | |
| Headquarters | 🇫🇷 France | 🇺🇸 USA | 🇺🇸 USA | 🇨🇳 China | 🇺🇸 USA | config |
| 🇺🇸 USA & 🇪🇺 EU Servers | Built-in | config | config | config | config | |
| GDPR / AI-Act readiness | By design | | | | | |
| HR Compliance (Safety & Guardrails) | Built-in | | | | | |
| HR-Focused | | | | | | |
| Pretraining Data | 1.2B Hiring Signals (Top hiring firms) | Noisy & Biased Web Data | Noisy & Biased Web Data | Noisy & Biased Web Data | Noisy & Biased Web Data | Noisy & Biased Web Data |
| Input Security (Prompt injection) | | | | | | |
| Pricing model | per request | per input tokens (unpredictable) | per input tokens (unpredictable) | per input tokens (unpredictable) | per input tokens (unpredictable) | per server (expensive) |
| Speed (cv=2k tokens/request) | ~100ms | ~3s | ~200ms | ~160ms | ~500ms | config |
| Rate-limit (cv=2k tokens) | ~24k vec/min | 2500 vec/min | ~2000 vec/min | ~36k vec/min | ~500 vec/min | config |
| Vector database | Built-in (Matching API) | config | Built-in | config | | |
| DevOps burden (production scale) | Lowest | Medium | Low | Medium | High | High |
| Deployment model | Managed API | Managed API | Managed API | Managed API | Managed API | Self-host |
| Core Technology | | | | | | |
| Technology | Deep hierarchical Encoders / Fairness & Bias Optimization | Deep Flat Encoders | Deep Flat Encoders | Deep Flat Encoders | Deep Flat Encoders | Deep Flat Encoders |
| Multilingual & Crosslingual | 43 lang | 100+ lang | 23 lang | 100+ lang | 40 lang | Config |
| Profile-encoder (Profile ↔ Profile) | Built-in (hierarchical) | Flat text | Flat text | Flat text | Flat text | Flat text |
| Job-encoder (Job ↔ Job) | Built-in (hierarchical) | Flat text | Flat text | Flat text | Flat text | Flat text |
| Dual-encoder (Profile ↔ Profile // Job ↔ Job) | Built-in (hierarchical) | Flat text | Flat text | Flat text | Flat text | Flat text |
| Cross-encoder (Profile ↔ Job) | Built-in (hierarchical) | Flat text | Flat text | Flat text | Flat text | Flat text |
| White-collar Roles Accuracy | High | Low | Low | Low | Lowest | Very Low |
| Blue-collar Roles Accuracy | High | Low | Low | Low | Lowest | Very Low |
| Junior Roles Accuracy | High | Low | Low | Low | Lowest | Very Low |
| Senior Roles Accuracy | High | Low | Low | Low | Lowest | Very Low |
| Custom Feature Engineering | Built-in (HR-native) | Config | | | | |
| Fairness Regularization | Built-in (Constraints) | | | | | |
| Data Calibration & Debiasing | Built-in (Pipeline) | | | | | |
| HR Stack integrations (add-ons) | | | | | | |
| Reasoning & Explainability | Built-in (Reasoning API) | Config | Config | | | |
| Resume, CV, Job parsers | Built-in (Parsing API) | Config | Config | | | |
| HR data enrichment & taxonomies | Built-in (Linking/Tagging/Asking APIs) | | | | | |
| Jobboards / ATS / HCM / HRIS connectors | 200+ connectors (Data Studio) | | | | | |
| Candidate & Recruiter UI | Widgets (App Studio) | | | | | |
Everything you need to know about the Embedding API
Our APIs are designed to complement each other and unlock your data's full potential
Transform HR documents into structured, enriched Talent & Workforce Data — powering every layer of Hiring Intelligence.
API Overview
Unlock Hiring Superintelligence at scale, with transparent, fair, and explainable ranking across every Talent signal.
API Overview
GET STARTED
Start parsing resumes and job postings in minutes with our powerful API.
HrFlow.ai is an API-first company and the leading AI-powered HR data automation platform.
The company helps 1,000+ customers (HR software vendors, staffing agencies, large employers, and headhunting firms) thrive in a high-volume, high-frequency labor market.
The platform provides a complete, fully integrated suite of HR data processing products, such as the Parsing API, Tagging API, Embedding API, Searching API, Scoring API, and Upskilling API, based on the analysis of hundreds of millions of career paths worldwide. It also offers a catalog of 200+ connectors to build custom scenarios that can automate any business logic.