notebooks


This folder contains notebooks showcasing concepts covered in the book. Most of the examples use only one of the subfolders in archive: the one that contains data for writers.stackexchange.com.

I've included a processed version of the data as a .csv for convenience.

If you want to generate this data yourself, or generate it for another subfolder, you should:

  • Download a subfolder from the Stack Exchange data dump archives

  • Run parse_xml_to_csv to convert it to a DataFrame

  • Run generate_model_text_features to generate a DataFrame with precomputed features
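
To give a feel for what the XML-to-CSV step involves, here is a minimal sketch using only the Python standard library. It is not the repository's parse_xml_to_csv; the helper names (xml_rows_to_records, records_to_csv) and the inline sample data are hypothetical, and it assumes the dump stores each post as a `<row>` element whose attributes hold the fields.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Tiny inline sample (made up for illustration) mimicking the assumed
# <row .../> layout of a Posts.xml file in a Stack Exchange dump.
SAMPLE_XML = """<posts>
  <row Id="1" PostTypeId="1" Score="5" Title="How do I outline a novel?" Body="&lt;p&gt;Where to start?&lt;/p&gt;" />
  <row Id="2" PostTypeId="2" Score="3" ParentId="1" Body="&lt;p&gt;Begin with the ending.&lt;/p&gt;" />
</posts>"""


def xml_rows_to_records(xml_text):
    """Hypothetical helper: turn each <row> element's attributes into a dict."""
    root = ET.fromstring(xml_text)
    return [dict(row.attrib) for row in root.iter("row")]


def records_to_csv(records):
    """Hypothetical helper: write records as CSV text.

    Columns are the sorted union of all keys, so rows that lack a field
    (answers have no Title, for example) get an empty cell.
    """
    columns = sorted({key for record in records for key in record})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


records = xml_rows_to_records(SAMPLE_XML)
csv_text = records_to_csv(records)
print(csv_text)
```

The resulting CSV text can be saved to disk and loaded into a DataFrame for the feature-generation step; the repository's own functions may handle extra cleaning that this sketch leaves out.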

The notebooks belong to a few categories of concepts, described below.

Data Exploration and Transformation

Initial Model Training and Performance Analysis

Improving the Model

Model Comparison

Generating Suggestions from Models