Dataset: Einstellung/demo-salaries

work_year (int64)	experience_level (string, 4 classes)	employment_type (string, 4 classes)	job_title (string, 9-40 chars)	salary (int64, 6k-30.4M)	salary_currency (string, 20 classes)	salary_in_usd (int64, 5.13k-450k)	employee_residence (string, 2 chars)	remote_ratio (int64, 0-100)	company_location (string, 2 chars)	company_size (string, 3 classes)
2,023	SE	FT	Principal Data Scientist	80,000	EUR	85,847	ES	100	ES	L
2,023	MI	CT	ML Engineer	30,000	USD	30,000	US	100	US	S
2,023	MI	CT	ML Engineer	25,500	USD	25,500	US	100	US	S
2,023	SE	FT	Data Scientist	175,000	USD	175,000	CA	100	CA	M
2,023	SE	FT	Data Scientist	120,000	USD	120,000	CA	100	CA	M
2,023	SE	FT	Applied Scientist	222,200	USD	222,200	US	0	US	L
2,023	SE	FT	Applied Scientist	136,000	USD	136,000	US	0	US	L
2,023	SE	FT	Data Scientist	219,000	USD	219,000	CA	0	CA	M
2,023	SE	FT	Data Scientist	141,000	USD	141,000	CA	0	CA	M
2,023	SE	FT	Data Scientist	147,100	USD	147,100	US	0	US	M
2,023	SE	FT	Data Scientist	90,700	USD	90,700	US	0	US	M
2,023	SE	FT	Data Analyst	130,000	USD	130,000	US	100	US	M
2,023	SE	FT	Data Analyst	100,000	USD	100,000	US	100	US	M
2,023	EN	FT	Applied Scientist	213,660	USD	213,660	US	0	US	L
2,023	EN	FT	Applied Scientist	130,760	USD	130,760	US	0	US	L
2,023	SE	FT	Data Modeler	147,100	USD	147,100	US	0	US	M
2,023	SE	FT	Data Modeler	90,700	USD	90,700	US	0	US	M
2,023	SE	FT	Data Scientist	170,000	USD	170,000	US	0	US	M
2,023	SE	FT	Data Scientist	150,000	USD	150,000	US	0	US	M
2,023	MI	FT	Data Analyst	150,000	USD	150,000	US	100	US	M
2,023	MI	FT	Data Analyst	110,000	USD	110,000	US	100	US	M
2,023	SE	FT	Research Engineer	275,000	USD	275,000	DE	0	DE	M
2,023	SE	FT	Research Engineer	174,000	USD	174,000	DE	0	DE	M
2,023	SE	FT	Analytics Engineer	230,000	USD	230,000	GB	100	GB	M
2,023	SE	FT	Analytics Engineer	143,200	USD	143,200	GB	100	GB	M
2,023	SE	FT	Business Intelligence Engineer	225,000	USD	225,000	US	0	US	M
2,023	SE	FT	Business Intelligence Engineer	156,400	USD	156,400	US	0	US	M
2,023	SE	FT	Machine Learning Engineer	200,000	USD	200,000	US	0	US	M
2,023	SE	FT	Machine Learning Engineer	130,000	USD	130,000	US	0	US	M
2,023	SE	FT	Data Strategist	90,000	USD	90,000	CA	0	CA	M
2,023	SE	FT	Data Strategist	72,000	USD	72,000	CA	0	CA	M
2,023	SE	FT	Data Engineer	253,200	USD	253,200	US	0	US	M
2,023	SE	FT	Data Engineer	90,700	USD	90,700	US	0	US	M
2,023	SE	FT	Computer Vision Engineer	342,810	USD	342,810	US	0	US	M
2,023	SE	FT	Computer Vision Engineer	184,590	USD	184,590	US	0	US	M
2,023	MI	FT	Data Engineer	162,500	USD	162,500	US	0	US	M
2,023	MI	FT	Data Engineer	130,000	USD	130,000	US	0	US	M
2,023	MI	FT	Data Analyst	105,380	USD	105,380	US	0	US	M
2,023	MI	FT	Data Analyst	64,500	USD	64,500	US	0	US	M
2,023	EN	FT	Data Quality Analyst	100,000	USD	100,000	NG	100	NG	L
2,023	EN	FT	Compliance Data Analyst	30,000	USD	30,000	NG	100	NG	L
2,022	MI	FT	Machine Learning Engineer	1,650,000	INR	20,984	IN	50	IN	L
2,023	EN	FT	Applied Scientist	204,620	USD	204,620	US	0	US	L
2,023	EN	FT	Applied Scientist	110,680	USD	110,680	US	0	US	L
2,023	SE	FT	Data Engineer	270,703	USD	270,703	US	0	US	M
2,023	SE	FT	Data Engineer	221,484	USD	221,484	US	0	US	M
2,023	SE	FT	Data Scientist	212,750	USD	212,750	US	100	US	M
2,023	SE	FT	Data Scientist	185,000	USD	185,000	US	100	US	M
2,023	SE	FT	Data Scientist	262,000	USD	262,000	US	100	US	M
2,023	SE	FT	Data Scientist	245,000	USD	245,000	US	100	US	M
2,023	SE	FT	Data Scientist	275,300	USD	275,300	US	100	US	M
2,023	SE	FT	Data Scientist	183,500	USD	183,500	US	100	US	M
2,023	SE	FT	Data Scientist	218,500	USD	218,500	US	100	US	M
2,023	SE	FT	Data Scientist	199,098	USD	199,098	US	100	US	M
2,023	SE	FT	Data Engineer	203,300	USD	203,300	US	100	US	M
2,023	SE	FT	Data Engineer	123,600	USD	123,600	US	100	US	M
2,023	SE	FT	Research Engineer	189,110	USD	189,110	US	0	US	M
2,023	SE	FT	Research Engineer	139,000	USD	139,000	US	0	US	M
2,023	EX	FT	Data Scientist	258,750	USD	258,750	US	0	US	M
2,023	EX	FT	Data Scientist	185,000	USD	185,000	US	0	US	M
2,023	SE	FT	Data Engineer	231,500	USD	231,500	US	100	US	M
2,023	SE	FT	Data Engineer	166,000	USD	166,000	US	100	US	M
2,023	SE	FT	Data Scientist	172,500	USD	172,500	US	100	US	M
2,023	SE	FT	Data Scientist	110,500	USD	110,500	US	100	US	M
2,023	SE	FT	Data Engineer	238,000	USD	238,000	US	0	US	M
2,023	SE	FT	Data Engineer	176,000	USD	176,000	US	0	US	M
2,023	SE	FT	Data Engineer	237,000	USD	237,000	US	100	US	M
2,023	SE	FT	Data Engineer	201,450	USD	201,450	US	100	US	M
2,023	SE	FT	Applied Scientist	309,400	USD	309,400	US	0	US	L
2,023	SE	FT	Applied Scientist	159,100	USD	159,100	US	0	US	L
2,023	SE	FT	Data Engineer	115,000	USD	115,000	US	0	US	M
2,023	SE	FT	Data Engineer	81,500	USD	81,500	US	0	US	M
2,023	SE	FT	Data Scientist	237,000	USD	237,000	US	100	US	M
2,023	SE	FT	Data Scientist	201,450	USD	201,450	US	100	US	M
2,023	SE	FT	Computer Vision Engineer	280,000	USD	280,000	US	0	US	M
2,023	SE	FT	Computer Vision Engineer	210,000	USD	210,000	US	0	US	M
2,023	SE	FT	Data Architect	280,100	USD	280,100	US	100	US	M
2,023	SE	FT	Data Architect	168,100	USD	168,100	US	100	US	M
2,023	SE	FT	Data Engineer	193,500	USD	193,500	US	100	US	M
2,023	SE	FT	Data Engineer	139,000	USD	139,000	US	100	US	M
2,023	MI	FT	Data Scientist	510,000	HKD	65,062	HK	0	HK	L
2,023	SE	FT	Machine Learning Engineer	150,000	USD	150,000	PT	100	US	M
2,023	MI	FT	Applied Machine Learning Engineer	65,000	EUR	69,751	IN	100	DE	S
2,022	EN	FT	AI Developer	300,000	USD	300,000	IN	50	IN	L
2,023	MI	FT	Machine Learning Engineer	90,000	EUR	96,578	NL	100	NL	L
2,023	SE	FT	Business Intelligence Engineer	185,900	USD	185,900	US	0	US	M
2,023	SE	FT	Business Intelligence Engineer	129,300	USD	129,300	US	0	US	M
2,023	SE	FT	Data Engineer	225,000	USD	225,000	US	100	US	M
2,023	SE	FT	Data Engineer	175,000	USD	175,000	US	100	US	M
2,023	SE	FT	Data Engineer	185,000	USD	185,000	US	0	US	M
2,023	SE	FT	Data Engineer	140,000	USD	140,000	US	0	US	M
2,023	SE	FT	Data Scientist	45,000	EUR	48,289	ES	0	ES	M
2,023	SE	FT	Data Scientist	36,000	EUR	38,631	ES	0	ES	M
2,023	SE	FT	Data Scientist	105,000	USD	105,000	US	0	US	M
2,023	SE	FT	Data Scientist	70,000	USD	70,000	US	0	US	M
2,023	EN	FT	Machine Learning Engineer	163,196	USD	163,196	US	0	US	M
2,023	EN	FT	Machine Learning Engineer	145,885	USD	145,885	US	0	US	M
2,023	SE	FT	Data Engineer	217,000	USD	217,000	US	100	US	M
2,023	SE	FT	Data Engineer	185,000	USD	185,000	US	100	US	M
2,023	SE	FT	Data Analyst	202,800	USD	202,800	US	0	US	L
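
The preview renders integers with thousands separators (for example `2,023` for the year 2023), so rows copied from it need cleaning before they can be used as numbers. The following is a minimal sketch, assuming tab-separated rows copied from the preview above; `COLUMNS`, `INT_COLUMNS`, and `parse_row` are illustrative names, not part of any dataset tooling.

```python
# Column order as shown in the preview header.
COLUMNS = [
    "work_year", "experience_level", "employment_type", "job_title",
    "salary", "salary_currency", "salary_in_usd", "employee_residence",
    "remote_ratio", "company_location", "company_size",
]
# Columns the preview header marks as int64.
INT_COLUMNS = {"work_year", "salary", "salary_in_usd", "remote_ratio"}


def parse_row(line):
    """Turn one tab-separated preview row into a dict with typed values."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(values)}")
    row = {}
    for name, value in zip(COLUMNS, values):
        if name in INT_COLUMNS:
            # Drop the viewer's thousands separators before converting.
            row[name] = int(value.replace(",", ""))
        else:
            row[name] = value
    return row


# Two rows copied from the preview above.
preview = [
    "2,023\tSE\tFT\tPrincipal Data Scientist\t80,000\tEUR\t85,847\tES\t100\tES\tL",
    "2,022\tMI\tFT\tMachine Learning Engineer\t1,650,000\tINR\t20,984\tIN\t50\tIN\tL",
]
rows = [parse_row(line) for line in preview]
print(rows[0]["work_year"], rows[1]["salary"])
```

Rows read from the underlying data files rather than from the rendered preview would presumably not need the separator cleanup.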


Dataset Summary

Briefly summarize the dataset, its intended use and the supported tasks. Give an overview of how and why the dataset was created. The summary should explicitly mention the languages present in the dataset (possibly in broad terms, e.g. translations between several pairs of European languages), and describe the domain, topic, or genre covered.

Supported Tasks and Leaderboards

For each of the tasks tagged for this dataset, give a brief description of the tag, metrics, and suggested models (with a link to their HuggingFace implementation if available). Give a similar description of tasks that were not covered by the structured tag set (replace the task-category-tag with an appropriate other:other-task-name).

task-category-tag: The dataset can be used to train a model for [TASK NAME], which consists in [TASK DESCRIPTION]. Success on this task is typically measured by achieving a high/low metric name. The (model name or model class) model currently achieves the following score. [IF A LEADERBOARD IS AVAILABLE]: This task has an active leaderboard which can be found at leaderboard url and ranks models based on metric name while also reporting other metric name.

Languages

Provide a brief overview of the languages represented in the dataset. Describe relevant details about specifics of the language such as whether it is social media text, African American English,...

When relevant, please provide BCP-47 codes, which consist of a primary language subtag, with a script subtag and/or region subtag if available.
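
To illustrate the subtag structure described above, here is a minimal sketch that splits a tag into its language, script, and region subtags. It handles only this simplified three-part shape, not the full BCP-47 grammar (no variants, extensions, or private-use subtags); `parse_tag` is an illustrative helper, not a standard library function.

```python
import re

# Simplified pattern: language (2-3 letters), optional script (4 letters),
# optional region (2 letters or 3 digits). Full BCP-47 allows more.
_TAG = re.compile(
    r"(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|\d{3}))?"
)


def parse_tag(tag):
    """Return the subtags of a simple language tag, or None if it does not fit."""
    match = _TAG.fullmatch(tag)
    if match is None:
        return None
    script = match.group("script")
    region = match.group("region")
    return {
        # Conventional casing: language lower, script title, region upper.
        "language": match.group("language").lower(),
        "script": script.title() if script else None,
        "region": region.upper() if region else None,
    }


for example in ("en", "en-US", "zh-Hant-TW", "es-419"):
    print(example, parse_tag(example))
```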

Dataset Structure

Data Instances

Provide a JSON-formatted example and brief description of a typical instance in the dataset. If available, provide a link to further examples.

{
  "example_field": ...,
  ...
}
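
As a concrete illustration, the first row of the preview table can be written as a JSON instance. This is a minimal sketch using only the Python standard library; the field names and values are taken from the preview, while the variable names are illustrative.

```python
import json

# First preview row, with the viewer's thousands separators removed.
instance = {
    "work_year": 2023,
    "experience_level": "SE",
    "employment_type": "FT",
    "job_title": "Principal Data Scientist",
    "salary": 80000,
    "salary_currency": "EUR",
    "salary_in_usd": 85847,
    "employee_residence": "ES",
    "remote_ratio": 100,
    "company_location": "ES",
    "company_size": "L",
}

# json.dumps emits double-quoted keys and strings, as JSON requires.
encoded = json.dumps(instance, indent=2)
print(encoded)
```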

Provide any additional information that is not covered in the other sections about the data here. In particular describe any relationships between data points and if these relationships are made explicit.

Data Fields

List and describe the fields present in the dataset. Mention their data type, and whether they are used as input or output in any of the tasks the dataset currently supports. If the data has span indices, describe their attributes, such as whether they are at the character level or word level, whether they are contiguous or not, etc. If the dataset contains example IDs, state whether they have an inherent meaning, such as a mapping to other datasets or pointing to relationships between data points.

example_field: description of example_field

Note that the descriptions can be initialized with the Show Markdown Data Fields output of the Datasets Tagging app, you will then only need to refine the generated descriptions.
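
One way to start this section is to generate a stub line per field from the column names and types, then fill in the descriptions by hand. The sketch below assumes the column names and types shown in the preview header; `SCHEMA` and `field_stubs` are illustrative names, and the placeholder text is deliberately left unwritten.

```python
# Column names and types as shown in the preview header.
SCHEMA = {
    "work_year": "int64",
    "experience_level": "string",
    "employment_type": "string",
    "job_title": "string",
    "salary": "int64",
    "salary_currency": "string",
    "salary_in_usd": "int64",
    "employee_residence": "string",
    "remote_ratio": "int64",
    "company_location": "string",
    "company_size": "string",
}


def field_stubs(schema):
    """Return one 'name (type): placeholder' line per field, in schema order."""
    return [
        f"{name} ({dtype}): <description to be written>"
        for name, dtype in schema.items()
    ]


stubs = field_stubs(SCHEMA)
print("\n".join(stubs))
```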

Data Splits

Describe and name the splits in the dataset if there is more than one.

Describe any criteria for splitting the data, if used. If there are differences between the splits (e.g. if the training annotations are machine-generated and the dev and test ones are created by humans, or if different numbers of annotators contributed to each example), describe them here.

Provide the sizes of each split. As appropriate, provide any descriptive statistics for the features, such as average length. For example:

	train	validation	test
Input Sentences
Average Sentence Length
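
A table like the one above can be produced from the splits themselves. The following is a minimal sketch, assuming each split is a non-empty list of sentences and that length is measured in whitespace-separated tokens; the toy data and the `split_table` helper are hypothetical and do not describe this dataset's actual splits.

```python
from statistics import mean


def split_table(splits):
    """Build a tab-separated table of per-split size and mean sentence length.

    Assumes every split is non-empty; mean() raises StatisticsError otherwise.
    """
    names = list(splits)
    counts = [str(len(splits[name])) for name in names]
    averages = [
        f"{mean(len(sentence.split()) for sentence in splits[name]):.1f}"
        for name in names
    ]
    return "\n".join([
        "\t".join([""] + names),
        "\t".join(["Input Sentences"] + counts),
        "\t".join(["Average Sentence Length"] + averages),
    ])


# Hypothetical toy splits, for illustration only.
toy_splits = {
    "train": ["a b c", "a b c d e"],
    "validation": ["a b"],
    "test": ["a b c d"],
}
table = split_table(toy_splits)
print(table)
```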

Dataset Creation

Curation Rationale

What need motivated the creation of this dataset? What are some of the reasons underlying the major choices involved in putting it together?

Source Data

This section describes the source data (e.g. news text and headlines, social media posts, translated sentences,...)

Initial Data Collection and Normalization

Describe the data collection process. Describe any criteria for data selection or filtering. List any key words or search terms used. If possible, include runtime information for the collection process.

If data was collected from other pre-existing datasets, link to source here and to their Hugging Face version.

If the data was modified or normalized after being collected (e.g. if the data is word-tokenized), describe the process and the tools used.

Who are the source language producers?

State whether the data was produced by humans or machine generated. Describe the people or systems who originally created the data.

If available, include self-reported demographic or identity information for the source data creators, but avoid inferring this information. Instead state that this information is unknown. See Larson 2017 for using identity categories as variables, particularly gender.

Describe the conditions under which the data was created (for example, if the producers were crowdworkers, state what platform was used, or if the data was found, what website the data was found on). If compensation was provided, include that information here.

Describe other people represented or mentioned in the data. Where possible, link to references for the information.

Annotations

If the dataset contains annotations which are not part of the initial data collection, describe them in the following paragraphs.

Annotation process

If applicable, describe the annotation process and any tools used, or state otherwise. Describe the amount of data annotated, if not all. Describe or reference annotation guidelines provided to the annotators. If available, provide interannotator statistics. Describe any annotation validation processes.

Who are the annotators?

If annotations were collected for the source data (such as class labels or syntactic parses), state whether the annotations were produced by humans or machine generated.

Describe the people or systems who originally created the annotations and their selection criteria if applicable.

If available, include self-reported demographic or identity information for the annotators, but avoid inferring this information. Instead state that this information is unknown. See Larson 2017 for using identity categories as variables, particularly gender.

Describe the conditions under which the data was annotated (for example, if the annotators were crowdworkers, state what platform was used, or if the data was found, what website the data was found on). If compensation was provided, include that information here.

Personal and Sensitive Information

State whether the dataset uses identity categories and, if so, how the information is used. Describe where this information comes from (i.e. self-reporting, collecting from profiles, inferring, etc.). See Larson 2017 for using identity categories as variables, particularly gender. State whether the data is linked to individuals and whether those individuals can be identified in the dataset, either directly or indirectly (i.e., in combination with other data).

State whether the dataset contains other data that might be considered sensitive (e.g., data that reveals racial or ethnic origins, sexual orientations, religious beliefs, political opinions or union memberships, or locations; financial or health data; biometric or genetic data; forms of government identification, such as social security numbers; criminal history).

If efforts were made to anonymize the data, describe the anonymization process.

Considerations for Using the Data

Social Impact of Dataset

Please discuss some of the ways you believe the use of this dataset will impact society.

The statement should include both positive outlooks, such as outlining how technologies developed through its use may improve people's lives, and discuss the accompanying risks. These risks may range from making important decisions more opaque to people who are affected by the technology, to reinforcing existing harmful biases (whose specifics should be discussed in the next section), among other considerations.

Also describe in this section if the proposed dataset contains a low-resource or under-represented language. If this is the case or if this task has any impact on underserved communities, please elaborate here.

Discussion of Biases

Provide descriptions of specific biases that are likely to be reflected in the data, and state whether any steps were taken to reduce their impact.

For Wikipedia text, see for example Dinan et al 2020 on biases in Wikipedia (esp. Table 1), or Blodgett et al 2020 for a more general discussion of the topic.

If analyses have been run quantifying these biases, please add brief summaries and links to the studies here.

Other Known Limitations

If studies of the datasets have outlined other limitations of the dataset, such as annotation artifacts, please outline and cite them here.

Additional Information

Dataset Curators

List the people involved in collecting the dataset and their affiliation(s). If funding information is known, include it here.

Licensing Information

Provide the license and link to the license webpage if available.

Citation Information

Provide the BibTeX-formatted reference for the dataset. For example:

@article{article_id,
  author    = {Author List},
  title     = {Dataset Paper Title},
  journal   = {Publication Venue},
  year      = {2525}
}

If the dataset has a DOI, please provide it here.

Contributions

Thanks to @github-username for adding this dataset.

Downloads last month: 298

Homepage:

Add homepage URL here if available (unless it's a GitHub repository)

Repository:

If the dataset is hosted on github or has a github homepage, add URL here

Paper:

If the dataset was introduced by a paper or there was a paper written describing the dataset, add URL here (landing page for Arxiv paper preferred)

Leaderboard:

If the dataset supports an active leaderboard, add link here

Point of Contact:

If known, name and email of at least one person the reader can contact for questions about the dataset.

Size of downloaded dataset files:

210 kB

Size of the auto-converted Parquet files:

51 kB

Number of rows:

3,755