{ "cells": [
{ "cell_type": "markdown", "metadata": {}, "source": [ "# HRHUB - Complete Bilateral Matching System\n", "\n", "## 🎯 System Architecture:\n", "\n", "```\n", "Candidates (9.5K) ←→ Postings (700) ←→ Companies (180K)\n", "       ↓                  ↓                   ↓\n", "  Skills text      Job requirements    Enriched profiles\n", "       ↓                  ↓                   ↓\n", "  Embeddings ──────→ SAME SPACE ℝ³⁸⁴ ←──────\n", "```\n", "\n", "## Key Innovation:\n", "\n", "**Use postings to enrich company profiles** so they speak the same language as candidates!\n", "\n", "- Companies describe: \"We are in tech industry\"\n", "- Postings translate: \"We need Python, AWS, React\"\n", "- Result: Companies can match with candidates!\n", "\n", "---" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## 📦 Step 1: Install & Import" ] },
{ "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "All packages ready!\n" ] } ], "source": [ "#!pip install -q sentence-transformers plotly anthropic scikit-learn umap-learn\n", "\n", "import pandas as pd\n", "import numpy as np\n", "from sentence_transformers import SentenceTransformer\n", "import plotly.express as px\n", "import plotly.graph_objects as go\n", "from sklearn.manifold import TSNE\n", "import warnings\n", "warnings.filterwarnings('ignore')\n", "\n", "print(\"All packages ready!\")" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Step 2: Load ALL Datasets" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Loading all datasets...\n", "\n", "======================================================================\n", "Candidates: 9,544 rows × 35 columns\n", "Companies (base): 24,473 rows\n", "Company industries: 24,375 rows\n", "Company specialties: 169,387 rows\n", "Employee counts: 35,787 rows\n", "Postings: 123,849 rows × 31 columns\n", "Job skills: 213,768 rows\n", "Job industries: 164,808 rows\n", "\n", "======================================================================\n", "All datasets loaded!\n", "\n" ] }, { "data": { "text/html": [ "
|  | address | career_objective | skills | educational_institution_name | degree_names | passing_years | educational_results | result_types | major_field_of_studies | professional_company_names | ... | online_links | issue_dates | expiry_dates | job_position_name | educationaL_requirements | experiencere_requirement | age_requirement | responsibilities.1 | skills_required | matched_score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | NaN | Big data analytics working and database wareho... | ['Big Data', 'Hadoop', 'Hive', 'Python', 'Mapr... | ['The Amity School of Engineering & Technology... | ['B.Tech'] | ['2019'] | ['N/A'] | [None] | ['Electronics'] | ['Coca-COla'] | ... | NaN | NaN | NaN | Senior Software Engineer | B.Sc in Computer Science & Engineering from a ... | At least 1 year | NaN | Technical Support\\nTroubleshooting\\nCollaborat... | NaN | 0.850000 |
| 1 | NaN | Fresher looking to join as a data analyst and ... | ['Data Analysis', 'Data Analytics', 'Business ... | ['Delhi University - Hansraj College', 'Delhi ... | ['B.Sc (Maths)', 'M.Sc (Science) (Statistics)'] | ['2015', '2018'] | ['N/A', 'N/A'] | ['N/A', 'N/A'] | ['Mathematics', 'Statistics'] | ['BIB Consultancy'] | ... | NaN | NaN | NaN | Machine Learning (ML) Engineer | M.Sc in Computer Science & Engineering or in a... | At least 5 year(s) | NaN | Machine Learning Leadership\\nCross-Functional ... | NaN | 0.750000 |
| 2 | NaN | NaN | ['Software Development', 'Machine Learning', '... | ['Birla Institute of Technology (BIT), Ranchi'] | ['B.Tech'] | ['2018'] | ['N/A'] | ['N/A'] | ['Electronics/Telecommunication'] | ['Axis Bank Limited'] | ... | NaN | NaN | NaN | Executive/ Senior Executive- Trade Marketing, ... | Master of Business Administration (MBA) | At least 3 years | NaN | Trade Marketing Executive\\nBrand Visibility, S... | Brand Promotion\\nCampaign Management\\nField Su... | 0.416667 |
| 3 | NaN | To obtain a position in a fast-paced business ... | ['accounts payables', 'accounts receivables', ... | ['Martinez Adult Education, Business Training ... | ['Computer Applications Specialist Certificate... | ['2008'] | [None] | [None] | ['Computer Applications'] | ['Company Name － City , State', 'Company Name... | ... | NaN | NaN | NaN | Business Development Executive | Bachelor/Honors | 1 to 3 years | Age 22 to 30 years | Apparel Sourcing\\nQuality Garment Sourcing\\nRe... | Fast typing skill\\nIELTSInternet browsing & on... | 0.760000 |
| 4 | NaN | Professional accountant with an outstanding wo... | ['Analytical reasoning', 'Compliance testing k... | ['Kent State University'] | ['Bachelor of Business Administration'] | [None] | ['3.84'] | [None] | ['Accounting'] | ['Company Name', 'Company Name', 'Company Name... | ... | [None] | [None] | ['February 15, 2021'] | Senior iOS Engineer | Bachelor of Science (BSc) in Computer Science | At least 4 years | NaN | iOS Lifecycle\\nRequirement Analysis\\nNative Fr... | iOS\\niOS App Developer\\niOS Application Develo... | 0.650000 |
5 rows × 35 columns
|  | address | career_objective | skills | educational_institution_name | degree_names | passing_years | educational_results | result_types | major_field_of_studies | professional_company_names | ... | online_links | issue_dates | expiry_dates | job_position_name | educationaL_requirements | experiencere_requirement | age_requirement | responsibilities.1 | skills_required | matched_score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 |  | Big data analytics working and database wareho... | ['Big Data', 'Hadoop', 'Hive', 'Python', 'Mapr... | ['The Amity School of Engineering & Technology... | ['B.Tech'] | ['2019'] | ['N/A'] | [None] | ['Electronics'] | ['Coca-COla'] | ... |  |  |  | Senior Software Engineer | B.Sc in Computer Science & Engineering from a ... | At least 1 year |  | Technical Support\\nTroubleshooting\\nCollaborat... |  | 0.850000 |
| 1 |  | Fresher looking to join as a data analyst and ... | ['Data Analysis', 'Data Analytics', 'Business ... | ['Delhi University - Hansraj College', 'Delhi ... | ['B.Sc (Maths)', 'M.Sc (Science) (Statistics)'] | ['2015', '2018'] | ['N/A', 'N/A'] | ['N/A', 'N/A'] | ['Mathematics', 'Statistics'] | ['BIB Consultancy'] | ... |  |  |  | Machine Learning (ML) Engineer | M.Sc in Computer Science & Engineering or in a... | At least 5 year(s) |  | Machine Learning Leadership\\nCross-Functional ... |  | 0.750000 |
| 2 |  |  | ['Software Development', 'Machine Learning', '... | ['Birla Institute of Technology (BIT), Ranchi'] | ['B.Tech'] | ['2018'] | ['N/A'] | ['N/A'] | ['Electronics/Telecommunication'] | ['Axis Bank Limited'] | ... |  |  |  | Executive/ Senior Executive- Trade Marketing, ... | Master of Business Administration (MBA) | At least 3 years |  | Trade Marketing Executive\\nBrand Visibility, S... | Brand Promotion\\nCampaign Management\\nField Su... | 0.416667 |
3 rows × 35 columns
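The notebook's key idea, enriching each company profile with the text of its postings before ranking companies per candidate by similarity, can be sketched as follows. This is a minimal illustration and not the notebook's own code: `embed`, `cosine`, `enrich_companies`, and `top_k_matches` are hypothetical helpers, the sample companies are invented, and a toy bag-of-words vector stands in for the `SentenceTransformer` embeddings so the sketch runs without downloading a model.

```python
import math
from collections import Counter, defaultdict


def embed(text):
    # Toy stand-in for a sentence embedding: unit-length bag-of-words counts.
    counts = Counter(text.lower().replace(',', ' ').split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {token: c / norm for token, c in counts.items()}


def cosine(a, b):
    # Both vectors are unit length, so the dot product is the cosine similarity.
    return sum(weight * b.get(token, 0.0) for token, weight in a.items())


def enrich_companies(companies, postings):
    # Append the text of every posting to the description of its company.
    extra = defaultdict(list)
    for company_id, posting_text in postings:
        extra[company_id].append(posting_text)
    return {cid: ' '.join([desc] + extra[cid]) for cid, desc in companies.items()}


def top_k_matches(candidate_text, company_texts, k=10):
    # Score every company against the candidate and keep the k best, ranked from 1.
    cand = embed(candidate_text)
    scored = [(cid, cosine(cand, embed(text))) for cid, text in company_texts.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(rank, cid, score) for rank, (cid, score) in enumerate(scored[:k], start=1)]


# Hypothetical sample data mirroring the example in the introduction.
companies = {'acme': 'We are in tech industry', 'bakery': 'We are in food industry'}
postings = [('acme', 'We need Python, AWS, React'),
            ('bakery', 'We need bakers and cashiers')]

enriched = enrich_companies(companies, postings)
matches = top_k_matches('Python AWS Hadoop Hive', enriched, k=2)
print(matches)
```

Without enrichment, the candidate's skills share no vocabulary with either company description, so every score is zero. After enrichment, the company whose postings mention Python and AWS ranks first. Swapping `embed` for a real sentence-embedding model would keep the same enrich, embed, rank structure.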
|  | candidate_id | company_id | company_name | rank | similarity_score | required_skills | posted_jobs |
|---|---|---|---|---|---|---|---|
| 0 | 0 | 72825 | Anblicks | 1 | 0.702806 |  |  |
| 1 | 0 | 65529778 | iO Associates - US | 2 | 0.702621 |  |  |
| 2 | 0 | 72825 | Anblicks | 3 | 0.702572 |  |  |
| 3 | 0 | 65529778 | iO Associates - US | 4 | 0.701938 |  |  |
| 4 | 0 | 72825 | Anblicks | 5 | 0.701032 |  |  |
| 5 | 0 | 33307792 | DATAECONOMY | 6 | 0.684871 |  |  |
| 6 | 0 | 323777 | Datavail | 7 | 0.682659 |  |  |
| 7 | 0 | 33307792 | DATAECONOMY | 8 | 0.680029 |  |  |
| 8 | 0 | 33307792 | DATAECONOMY | 9 | 0.678448 |  |  |
| 9 | 0 | 1016007 | BitPusher | 10 | 0.677616 |  |  |
| 10 | 1 | 98704 | Analytic Recruiting Inc. | 1 | 0.620458 |  |  |
| 11 | 1 | 2681218 | Logikk | 2 | 0.620197 |  |  |
| 12 | 1 | 47591650 | Heliosz.AI | 3 | 0.589462 |  |  |
| 13 | 1 | 1092280 | BPO Recruit | 4 | 0.576554 |  |  |
| 14 | 1 | 28156433 | IntellectFaces, Inc | 5 | 0.575582 |  |  |
| 15 | 1 | 90406839 | Vedan Technologies | 6 | 0.566410 |  |  |
| 16 | 1 | 575811 | Burtch Works | 7 | 0.565807 |  |  |
| 17 | 1 | 2319092 | Cleartelligence | 8 | 0.560748 |  |  |
| 18 | 1 | 13423341 | Ampstek | 9 | 0.545455 |  |  |
| 19 | 1 | 13423341 | Ampstek | 10 | 0.545455 |  |  |