Column | Type | Summary
work_year | int64 | min 2.02k, max 2.02k
experience_level | string | 4 distinct values
employment_type | string | 4 distinct values
job_title | string | length 9 to 40
salary | int64 | min 6k, max 30.4M
salary_currency | string | 20 distinct values
salary_in_usd | int64 | min 5.13k, max 450k
employee_residence | string | length 2
remote_ratio | int64 | min 0, max 100
company_location | string | length 2
company_size | string | 3 distinct values

work_year | experience_level | employment_type | job_title | salary | salary_currency | salary_in_usd | employee_residence | remote_ratio | company_location | company_size
2023 | SE | FT | Principal Data Scientist | 80,000 | EUR | 85,847 | ES | 100 | ES | L
2023 | MI | CT | ML Engineer | 30,000 | USD | 30,000 | US | 100 | US | S
2023 | MI | CT | ML Engineer | 25,500 | USD | 25,500 | US | 100 | US | S
2023 | SE | FT | Data Scientist | 175,000 | USD | 175,000 | CA | 100 | CA | M
2023 | SE | FT | Data Scientist | 120,000 | USD | 120,000 | CA | 100 | CA | M
2023 | SE | FT | Applied Scientist | 222,200 | USD | 222,200 | US | 0 | US | L
2023 | SE | FT | Applied Scientist | 136,000 | USD | 136,000 | US | 0 | US | L
2023 | SE | FT | Data Scientist | 219,000 | USD | 219,000 | CA | 0 | CA | M
2023 | SE | FT | Data Scientist | 141,000 | USD | 141,000 | CA | 0 | CA | M
2023 | SE | FT | Data Scientist | 147,100 | USD | 147,100 | US | 0 | US | M
2023 | SE | FT | Data Scientist | 90,700 | USD | 90,700 | US | 0 | US | M
2023 | SE | FT | Data Analyst | 130,000 | USD | 130,000 | US | 100 | US | M
2023 | SE | FT | Data Analyst | 100,000 | USD | 100,000 | US | 100 | US | M
2023 | EN | FT | Applied Scientist | 213,660 | USD | 213,660 | US | 0 | US | L
2023 | EN | FT | Applied Scientist | 130,760 | USD | 130,760 | US | 0 | US | L
2023 | SE | FT | Data Modeler | 147,100 | USD | 147,100 | US | 0 | US | M
2023 | SE | FT | Data Modeler | 90,700 | USD | 90,700 | US | 0 | US | M
2023 | SE | FT | Data Scientist | 170,000 | USD | 170,000 | US | 0 | US | M
2023 | SE | FT | Data Scientist | 150,000 | USD | 150,000 | US | 0 | US | M
2023 | MI | FT | Data Analyst | 150,000 | USD | 150,000 | US | 100 | US | M
2023 | MI | FT | Data Analyst | 110,000 | USD | 110,000 | US | 100 | US | M
2023 | SE | FT | Research Engineer | 275,000 | USD | 275,000 | DE | 0 | DE | M
2023 | SE | FT | Research Engineer | 174,000 | USD | 174,000 | DE | 0 | DE | M
2023 | SE | FT | Analytics Engineer | 230,000 | USD | 230,000 | GB | 100 | GB | M
2023 | SE | FT | Analytics Engineer | 143,200 | USD | 143,200 | GB | 100 | GB | M
2023 | SE | FT | Business Intelligence Engineer | 225,000 | USD | 225,000 | US | 0 | US | M
2023 | SE | FT | Business Intelligence Engineer | 156,400 | USD | 156,400 | US | 0 | US | M
2023 | SE | FT | Machine Learning Engineer | 200,000 | USD | 200,000 | US | 0 | US | M
2023 | SE | FT | Machine Learning Engineer | 130,000 | USD | 130,000 | US | 0 | US | M
2023 | SE | FT | Data Strategist | 90,000 | USD | 90,000 | CA | 0 | CA | M
2023 | SE | FT | Data Strategist | 72,000 | USD | 72,000 | CA | 0 | CA | M
2023 | SE | FT | Data Engineer | 253,200 | USD | 253,200 | US | 0 | US | M
2023 | SE | FT | Data Engineer | 90,700 | USD | 90,700 | US | 0 | US | M
2023 | SE | FT | Computer Vision Engineer | 342,810 | USD | 342,810 | US | 0 | US | M
2023 | SE | FT | Computer Vision Engineer | 184,590 | USD | 184,590 | US | 0 | US | M
2023 | MI | FT | Data Engineer | 162,500 | USD | 162,500 | US | 0 | US | M
2023 | MI | FT | Data Engineer | 130,000 | USD | 130,000 | US | 0 | US | M
2023 | MI | FT | Data Analyst | 105,380 | USD | 105,380 | US | 0 | US | M
2023 | MI | FT | Data Analyst | 64,500 | USD | 64,500 | US | 0 | US | M
2023 | EN | FT | Data Quality Analyst | 100,000 | USD | 100,000 | NG | 100 | NG | L
2023 | EN | FT | Compliance Data Analyst | 30,000 | USD | 30,000 | NG | 100 | NG | L
2022 | MI | FT | Machine Learning Engineer | 1,650,000 | INR | 20,984 | IN | 50 | IN | L
2023 | EN | FT | Applied Scientist | 204,620 | USD | 204,620 | US | 0 | US | L
2023 | EN | FT | Applied Scientist | 110,680 | USD | 110,680 | US | 0 | US | L
2023 | SE | FT | Data Engineer | 270,703 | USD | 270,703 | US | 0 | US | M
2023 | SE | FT | Data Engineer | 221,484 | USD | 221,484 | US | 0 | US | M
2023 | SE | FT | Data Scientist | 212,750 | USD | 212,750 | US | 100 | US | M
2023 | SE | FT | Data Scientist | 185,000 | USD | 185,000 | US | 100 | US | M
2023 | SE | FT | Data Scientist | 262,000 | USD | 262,000 | US | 100 | US | M
2023 | SE | FT | Data Scientist | 245,000 | USD | 245,000 | US | 100 | US | M
2023 | SE | FT | Data Scientist | 275,300 | USD | 275,300 | US | 100 | US | M
2023 | SE | FT | Data Scientist | 183,500 | USD | 183,500 | US | 100 | US | M
2023 | SE | FT | Data Scientist | 218,500 | USD | 218,500 | US | 100 | US | M
2023 | SE | FT | Data Scientist | 199,098 | USD | 199,098 | US | 100 | US | M
2023 | SE | FT | Data Engineer | 203,300 | USD | 203,300 | US | 100 | US | M
2023 | SE | FT | Data Engineer | 123,600 | USD | 123,600 | US | 100 | US | M
2023 | SE | FT | Research Engineer | 189,110 | USD | 189,110 | US | 0 | US | M
2023 | SE | FT | Research Engineer | 139,000 | USD | 139,000 | US | 0 | US | M
2023 | EX | FT | Data Scientist | 258,750 | USD | 258,750 | US | 0 | US | M
2023 | EX | FT | Data Scientist | 185,000 | USD | 185,000 | US | 0 | US | M
2023 | SE | FT | Data Engineer | 231,500 | USD | 231,500 | US | 100 | US | M
2023 | SE | FT | Data Engineer | 166,000 | USD | 166,000 | US | 100 | US | M
2023 | SE | FT | Data Scientist | 172,500 | USD | 172,500 | US | 100 | US | M
2023 | SE | FT | Data Scientist | 110,500 | USD | 110,500 | US | 100 | US | M
2023 | SE | FT | Data Engineer | 238,000 | USD | 238,000 | US | 0 | US | M
2023 | SE | FT | Data Engineer | 176,000 | USD | 176,000 | US | 0 | US | M
2023 | SE | FT | Data Engineer | 237,000 | USD | 237,000 | US | 100 | US | M
2023 | SE | FT | Data Engineer | 201,450 | USD | 201,450 | US | 100 | US | M
2023 | SE | FT | Applied Scientist | 309,400 | USD | 309,400 | US | 0 | US | L
2023 | SE | FT | Applied Scientist | 159,100 | USD | 159,100 | US | 0 | US | L
2023 | SE | FT | Data Engineer | 115,000 | USD | 115,000 | US | 0 | US | M
2023 | SE | FT | Data Engineer | 81,500 | USD | 81,500 | US | 0 | US | M
2023 | SE | FT | Data Scientist | 237,000 | USD | 237,000 | US | 100 | US | M
2023 | SE | FT | Data Scientist | 201,450 | USD | 201,450 | US | 100 | US | M
2023 | SE | FT | Computer Vision Engineer | 280,000 | USD | 280,000 | US | 0 | US | M
2023 | SE | FT | Computer Vision Engineer | 210,000 | USD | 210,000 | US | 0 | US | M
2023 | SE | FT | Data Architect | 280,100 | USD | 280,100 | US | 100 | US | M
2023 | SE | FT | Data Architect | 168,100 | USD | 168,100 | US | 100 | US | M
2023 | SE | FT | Data Engineer | 193,500 | USD | 193,500 | US | 100 | US | M
2023 | SE | FT | Data Engineer | 139,000 | USD | 139,000 | US | 100 | US | M
2023 | MI | FT | Data Scientist | 510,000 | HKD | 65,062 | HK | 0 | HK | L
2023 | SE | FT | Machine Learning Engineer | 150,000 | USD | 150,000 | PT | 100 | US | M
2023 | MI | FT | Applied Machine Learning Engineer | 65,000 | EUR | 69,751 | IN | 100 | DE | S
2022 | EN | FT | AI Developer | 300,000 | USD | 300,000 | IN | 50 | IN | L
2023 | MI | FT | Machine Learning Engineer | 90,000 | EUR | 96,578 | NL | 100 | NL | L
2023 | SE | FT | Business Intelligence Engineer | 185,900 | USD | 185,900 | US | 0 | US | M
2023 | SE | FT | Business Intelligence Engineer | 129,300 | USD | 129,300 | US | 0 | US | M
2023 | SE | FT | Data Engineer | 225,000 | USD | 225,000 | US | 100 | US | M
2023 | SE | FT | Data Engineer | 175,000 | USD | 175,000 | US | 100 | US | M
2023 | SE | FT | Data Engineer | 185,000 | USD | 185,000 | US | 0 | US | M
2023 | SE | FT | Data Engineer | 140,000 | USD | 140,000 | US | 0 | US | M
2023 | SE | FT | Data Scientist | 45,000 | EUR | 48,289 | ES | 0 | ES | M
2023 | SE | FT | Data Scientist | 36,000 | EUR | 38,631 | ES | 0 | ES | M
2023 | SE | FT | Data Scientist | 105,000 | USD | 105,000 | US | 0 | US | M
2023 | SE | FT | Data Scientist | 70,000 | USD | 70,000 | US | 0 | US | M
2023 | EN | FT | Machine Learning Engineer | 163,196 | USD | 163,196 | US | 0 | US | M
2023 | EN | FT | Machine Learning Engineer | 145,885 | USD | 145,885 | US | 0 | US | M
2023 | SE | FT | Data Engineer | 217,000 | USD | 217,000 | US | 100 | US | M
2023 | SE | FT | Data Engineer | 185,000 | USD | 185,000 | US | 100 | US | M
2023 | SE | FT | Data Analyst | 202,800 | USD | 202,800 | US | 0 | US | L
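
The preview rows pair each local-currency salary with a salary_in_usd value, so the exchange rate behind each conversion can be recovered by dividing one by the other. The following is a minimal sketch of that arithmetic using four rows copied from the preview; the SalaryRow class and its implied_usd_rate method are hypothetical names introduced for illustration and are not part of the dataset or of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SalaryRow:
    """One preview row; field names follow the column headers of the preview."""
    work_year: int
    experience_level: str
    employment_type: str
    job_title: str
    salary: int
    salary_currency: str
    salary_in_usd: int
    employee_residence: str
    remote_ratio: int
    company_location: str
    company_size: str

    def implied_usd_rate(self) -> float:
        # USD per one unit of the local currency, as implied by this row.
        return self.salary_in_usd / self.salary


# Four rows copied from the preview (one per currency that appears in it).
rows = [
    SalaryRow(2023, "SE", "FT", "Principal Data Scientist", 80_000, "EUR", 85_847, "ES", 100, "ES", "L"),
    SalaryRow(2022, "MI", "FT", "Machine Learning Engineer", 1_650_000, "INR", 20_984, "IN", 50, "IN", "L"),
    SalaryRow(2023, "MI", "FT", "Data Scientist", 510_000, "HKD", 65_062, "HK", 0, "HK", "L"),
    SalaryRow(2023, "SE", "FT", "Data Scientist", 175_000, "USD", 175_000, "CA", 100, "CA", "M"),
]

# Map each currency code to the rate implied by its sample row.
rates = {row.salary_currency: round(row.implied_usd_rate(), 4) for row in rows}

for currency, rate in rates.items():
    print(f"1 {currency} is about {rate} USD in these rows")
```

The other EUR rows in the preview (for example 65,000 EUR to 69,751 USD) imply nearly the same rate, which suggests a single rate per currency and year was applied, although the card itself does not state how the conversion was done.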

Dataset Summary

Briefly summarize the dataset, its intended use and the supported tasks. Give an overview of how and why the dataset was created. The summary should explicitly mention the languages present in the dataset (possibly in broad terms, e.g. translations between several pairs of European languages), and describe the domain, topic, or genre covered.

Supported Tasks and Leaderboards

For each of the tasks tagged for this dataset, give a brief description of the tag, metrics, and suggested models (with a link to their Hugging Face implementation if available). Give a similar description of tasks that were not covered by the structured tag set (replace the task-category-tag with an appropriate other:other-task-name).

  • task-category-tag: The dataset can be used to train a model for [TASK NAME], which consists in [TASK DESCRIPTION]. Success on this task is typically measured by achieving a high/low metric name. The (model name or model class) model currently achieves the following score. [IF A LEADERBOARD IS AVAILABLE]: This task has an active leaderboard which can be found at leaderboard url and ranks models based on metric name while also reporting other metric name.

Languages

Provide a brief overview of the languages represented in the dataset. Describe relevant details about specifics of the language such as whether it is social media text, African American English,...

When relevant, please provide BCP-47 codes, which consist of a primary language subtag, with a script subtag and/or region subtag if available.

Dataset Structure

Data Instances

Provide a JSON-formatted example and brief description of a typical instance in the dataset. If available, provide a link to further examples.

{
  'example_field': ...,
  ...
}
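
As an illustration only, the first row of the preview could be written as a JSON instance along the following lines. This is a sketch that assumes the field names match the preview's column headers and that the integer columns are stored without thousands separators; the variable name instance is hypothetical.

```python
import json

# First preview row, keyed by the column names shown in the preview.
instance = {
    "work_year": 2023,
    "experience_level": "SE",
    "employment_type": "FT",
    "job_title": "Principal Data Scientist",
    "salary": 80000,
    "salary_currency": "EUR",
    "salary_in_usd": 85847,
    "employee_residence": "ES",
    "remote_ratio": 100,
    "company_location": "ES",
    "company_size": "L",
}

# Serialize with indentation so the instance is easy to read in a card.
print(json.dumps(instance, indent=2))
```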

Provide any additional information that is not covered in the other sections about the data here. In particular describe any relationships between data points and if these relationships are made explicit.

Data Fields

List and describe the fields present in the dataset. Mention their data type, and whether they are used as input or output in any of the tasks the dataset currently supports. If the data has span indices, describe their attributes, such as whether they are at the character level or word level, whether they are contiguous or not, etc. If the dataset contains example IDs, state whether they have an inherent meaning, such as a mapping to other datasets or pointing to relationships between data points.

  • example_field: description of example_field

Note that the descriptions can be initialized with the Show Markdown Data Fields output of the Datasets Tagging app, you will then only need to refine the generated descriptions.

Data Splits

Describe and name the splits in the dataset if there is more than one.

Describe any criteria for splitting the data, if used. If there are differences between the splits (e.g. if the training annotations are machine-generated and the dev and test ones are created by humans, or if different numbers of annotators contributed to each example), describe them here.

Provide the sizes of each split. As appropriate, provide any descriptive statistics for the features, such as average length. For example:

 | train | validation | test
Input Sentences | | |
Average Sentence Length | | |

Dataset Creation

Curation Rationale

What need motivated the creation of this dataset? What are some of the reasons underlying the major choices involved in putting it together?

Source Data

This section describes the source data (e.g. news text and headlines, social media posts, translated sentences,...)

Initial Data Collection and Normalization

Describe the data collection process. Describe any criteria for data selection or filtering. List any key words or search terms used. If possible, include runtime information for the collection process.

If data was collected from other pre-existing datasets, link to source here and to their Hugging Face version.

If the data was modified or normalized after being collected (e.g. if the data is word-tokenized), describe the process and the tools used.

Who are the source language producers?

State whether the data was produced by humans or machine generated. Describe the people or systems who originally created the data.

If available, include self-reported demographic or identity information for the source data creators, but avoid inferring this information. Instead state that this information is unknown. See Larson 2017 for using identity categories as variables, particularly gender.

Describe the conditions under which the data was created (for example, if the producers were crowdworkers, state what platform was used, or if the data was found, what website the data was found on). If compensation was provided, include that information here.

Describe other people represented or mentioned in the data. Where possible, link to references for the information.

Annotations

If the dataset contains annotations which are not part of the initial data collection, describe them in the following paragraphs.

Annotation process

If applicable, describe the annotation process and any tools used, or state otherwise. Describe the amount of data annotated, if not all. Describe or reference annotation guidelines provided to the annotators. If available, provide interannotator statistics. Describe any annotation validation processes.

Who are the annotators?

If annotations were collected for the source data (such as class labels or syntactic parses), state whether the annotations were produced by humans or machine generated.

Describe the people or systems who originally created the annotations and their selection criteria if applicable.

If available, include self-reported demographic or identity information for the annotators, but avoid inferring this information. Instead state that this information is unknown. See Larson 2017 for using identity categories as variables, particularly gender.

Describe the conditions under which the data was annotated (for example, if the annotators were crowdworkers, state what platform was used, or if the data was found, what website the data was found on). If compensation was provided, include that information here.

Personal and Sensitive Information

State whether the dataset uses identity categories and, if so, how the information is used. Describe where this information comes from (i.e. self-reporting, collecting from profiles, inferring, etc.). See Larson 2017 for using identity categories as variables, particularly gender. State whether the data is linked to individuals and whether those individuals can be identified in the dataset, either directly or indirectly (i.e., in combination with other data).

State whether the dataset contains other data that might be considered sensitive (e.g., data that reveals racial or ethnic origins, sexual orientations, religious beliefs, political opinions or union memberships, or locations; financial or health data; biometric or genetic data; forms of government identification, such as social security numbers; criminal history).

If efforts were made to anonymize the data, describe the anonymization process.

Considerations for Using the Data

Social Impact of Dataset

Please discuss some of the ways you believe the use of this dataset will impact society.

The statement should include both positive outlooks, such as outlining how technologies developed through its use may improve people's lives, and discuss the accompanying risks. These risks may range from making important decisions more opaque to people who are affected by the technology, to reinforcing existing harmful biases (whose specifics should be discussed in the next section), among other considerations.

Also describe in this section if the proposed dataset contains a low-resource or under-represented language. If this is the case or if this task has any impact on underserved communities, please elaborate here.

Discussion of Biases

Provide descriptions of specific biases that are likely to be reflected in the data, and state whether any steps were taken to reduce their impact.

For Wikipedia text, see for example Dinan et al 2020 on biases in Wikipedia (esp. Table 1), or Blodgett et al 2020 for a more general discussion of the topic.

If analyses have been run quantifying these biases, please add brief summaries and links to the studies here.

Other Known Limitations

If studies of the datasets have outlined other limitations of the dataset, such as annotation artifacts, please outline and cite them here.

Additional Information

Dataset Curators

List the people involved in collecting the dataset and their affiliation(s). If funding information is known, include it here.

Licensing Information

Provide the license and link to the license webpage if available.

Citation Information

Provide the BibTeX-formatted reference for the dataset. For example:

@article{article_id,
  author    = {Author List},
  title     = {Dataset Paper Title},
  journal   = {Publication Venue},
  year      = {2525}
}

If the dataset has a DOI, please provide it here.

Contributions

Thanks to @github-username for adding this dataset.


Space using Einstellung/demo-salaries 1