this post was submitted on 08 Sep 2024

Data is Beautiful

Collected 2024 US tech job postings from Indeed and embedded them with OpenAI's large text embedding model. Reduced dimensionality with UMAP and clustered with HDBSCAN. Named the cluster topics with the OpenAI chat API. Visualized with DataMapPlot. The full interactive map is on GitHub Pages: https://hazondata.github.io/. I also have real-time insights into tech job postings on my site, hazon.fyi.
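
For anyone who wants to try something similar, here is a rough sketch of the embed, reduce, and cluster steps. It is not the author's code: the model name `text-embedding-3-large`, the choice to cluster on the 2D UMAP output, `min_cluster_size=50`, and all function names are assumptions for illustration. It needs the `openai`, `umap-learn`, `hdbscan`, and `numpy` packages, and it leaves out request batching, which a 250k-posting dataset would require.

```python
from collections import defaultdict


def embed_postings(texts, model="text-embedding-3-large"):
    """Embed posting texts with the OpenAI embeddings endpoint.

    The model name is an assumption; the post only says "text embedding large".
    Requires the `openai` package and OPENAI_API_KEY in the environment.
    Batching is omitted: a large dataset must be sent in chunks.
    """
    from openai import OpenAI  # lazy import so the helper below works without it

    client = OpenAI()
    response = client.embeddings.create(model=model, input=texts)
    return [item.embedding for item in response.data]


def reduce_and_cluster(embeddings, min_cluster_size=50):
    """Project embeddings to 2D with UMAP, then cluster the 2D points with HDBSCAN.

    Returns (coords, labels). HDBSCAN marks noise points with label -1.
    Clustering on the 2D projection is an assumption, not a confirmed detail.
    """
    import numpy as np
    import umap  # provided by the umap-learn package
    import hdbscan

    vectors = np.asarray(embeddings, dtype=np.float32)
    reducer = umap.UMAP(n_components=2, metric="cosine", random_state=42)
    coords = reducer.fit_transform(vectors)
    labels = hdbscan.HDBSCAN(min_cluster_size=min_cluster_size).fit_predict(coords)
    return coords, labels.tolist()


def sample_texts_per_cluster(texts, labels, max_per_cluster=5):
    """Collect a few example postings per cluster, skipping noise (-1).

    Samples like these could be pasted into a chat prompt to name each topic.
    """
    samples = defaultdict(list)
    for text, label in zip(texts, labels):
        if label != -1 and len(samples[label]) < max_per_cluster:
            samples[label].append(text)
    return dict(samples)
```

The per-cluster samples would then go to a chat model for topic names, and the coordinates plus names to DataMapPlot for the interactive map; both steps are left out here to keep the sketch short.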

https://old.reddit.com/r/dataisbeautiful/comments/1fakvwv/oc_clustering_250k_tech_job_postings_in_2024/

[email protected] 5 points 9 months ago

No wonder it was taking so long to load; it's a 58MB HTML file.

Really cool stuff though. I'd love to see more information about what's on the screen:

  • Number of postings (updated when filtered using the search);
  • Some way to visualize postings in the intersection of these clusters, e.g. Software Dev with Education, or AI with DevOps;
  • Word cloud of the most common terms in the selected postings;
  • Ways to export the filtered data.
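
A few of these (the filtered count, the term frequencies behind a word cloud, and the export) could be prototyped offline before touching the map itself. The sketch below is only an illustration using the Python standard library: the posting fields `title`, `cluster`, and `description`, the tiny stopword list, and the function names are all hypothetical, not taken from the actual dataset.

```python
import csv
import io
import re
from collections import Counter

# Deliberately tiny stopword list, for illustration only.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "with", "on"}


def filter_postings(postings, query):
    """Keep postings whose title or description contains the query (case-insensitive)."""
    needle = query.lower()
    return [
        p for p in postings
        if needle in p["title"].lower() or needle in p["description"].lower()
    ]


def top_terms(postings, n=10):
    """Count the most common non-stopword terms: the data behind a word cloud."""
    counts = Counter()
    for p in postings:
        words = re.findall(r"[a-z][a-z+#]*", p["description"].lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)


def to_csv(postings, fields=("title", "cluster", "description")):
    """Serialize the current selection to CSV text, ignoring any extra fields."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(postings)
    return buffer.getvalue()
```

The count for the first bullet is just `len(filter_postings(postings, query))`. Cluster intersections are harder, since HDBSCAN gives each posting a single label, so that bullet would need something beyond this sketch.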