revised graphics
dspiegel29 committed Aug 12, 2019
1 parent 3a15a66 commit 42029fd
Showing 21 changed files with 936 additions and 52 deletions.
@@ -7,12 +7,23 @@

Data are shown in Table 1.1 (page 23) and are contained in [01-1-child-heart-survival-x.csv](01-1-child-heart-survival-x.csv). The data were originally presented in the [NCHDA 2012-15 report](https://nicor4.nicor.org.uk/chd/an_paeds.nsf/vwContent/Analysis%20Documents?Opendocument), but are best seen on [childrensheartsurgery.info](http://childrensheartsurgery.info/).

```{r figure 1-1}
```{r}
library(ggplot2)
ThirtyDaySurv <-read.csv("01-1-child-heart-survival-x.csv", header=TRUE) # reads data into ThirtyDaySurv data frame
nhosp=length(ThirtyDaySurv$Hospital)
attach(ThirtyDaySurv)
nhosp=length(Hospital)
```
First in R base graphics
```{r}
par(mar=c(5,15,4,2))
barplot(ThirtyDaySurvival,names.arg=Hospital,horiz=T,xlim=c(86,100), xpd=F,las=1, xlab="% surviving 30 days")
```


Now in ggplot2
```{r}
p <- ggplot(ThirtyDaySurv, aes(x=reorder(Hospital,nhosp:1), y= ThirtyDaySurvival, fill=Hospital)) # constructs initial plot object, starting with top row
p <- p + geom_bar(stat = "identity") # assigns bar chart-type
p <- p + coord_flip(ylim = c(86,100)) # flips to horizontal bars and limits y-axis
@@ -26,4 +37,16 @@ p # draws the plot

_Figure 1.1 Bar-chart of 30-day survival rates for thirteen hospitals. The choice of the start of the horizontal axis, here 86%, can have a crucial effect on the impression given by the graphic. If the axis starts at 0%, all the hospitals will look indistinguishable, whereas if we started at 95% the differences would look misleadingly dramatic._
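
The effect of the axis origin described in the caption can be checked with simple arithmetic. The sketch below is only an illustration: the two survival percentages and the helper `bar_ratio` are hypothetical and are not taken from the data file. It compares how much longer the taller bar appears than the shorter one when the axis starts at 0%, 86% and 95%.

```r
# Hypothetical survival percentages for two hospitals (illustrative values only,
# not read from 01-1-child-heart-survival-x.csv)
survival <- c(A = 96, B = 98)

# The visible length of a bar is its value minus the axis origin.
# bar_ratio (a hypothetical helper) returns longest bar / shortest bar.
bar_ratio <- function(values, origin) {
  bar_lengths <- values - origin
  unname(max(bar_lengths) / min(bar_lengths))
}

origins <- c(0, 86, 95)
ratios <- sapply(origins, function(o) bar_ratio(survival, o))
names(ratios) <- paste0("origin_", origins)

print(round(ratios, 2))
```

With these assumed values the two bars are almost the same length when the axis starts at 0%, while starting at 95% makes one bar three times as long as the other, even though the underlying difference is two percentage points in both cases.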

For other ways of displaying and explaining this data, and more recent results, see [childrensheartsurgery.info](http://childrensheartsurgery.info/).
For other ways of displaying and explaining this data, and more recent results, see [childrensheartsurgery.info](http://childrensheartsurgery.info/), which uses a dot-plot. This may be more appropriate.

```{r figure 1-1}
p <- ggplot(ThirtyDaySurv, aes(x=reorder(Hospital,nhosp:1), y= ThirtyDaySurvival, fill=Hospital)) # constructs initial plot object, starting with top row
p <- p + geom_dotplot(binaxis="y",stackdir="center",dotsize=2) # assigns dot-plot chart type
p <- p + coord_flip(ylim = c(86,100)) # flips to a horizontal layout and limits y-axis
p <- p + scale_y_continuous(breaks=seq(86, 100, 2)) # assigns breaks every 2 percent
p <- p + theme(legend.position="none") # removes the legend
p <- p + labs(x="", y="% surviving 30 days") # Adds y-axis label
p # draws the plot
```

_Figure 1.1 30-day survival rates for thirteen hospitals represented as a dot-plot, in which a non-zero axis origin is less important because, unlike in a bar chart, the axis is not directly connected to the data-point._

Large diffs are not rendered by default.

@@ -16,6 +16,18 @@ library(ggplot2)
df <- read.csv("01-1-child-heart-survival-x.csv", header=TRUE) # reads csv into dataframe, df
df$Percentage = 100*df$Operations/sum(df$Operations)
df$Pos= rank(df$Percentage)
```
First in R base graphics

```{r}
par(mar=c(5,15,4,2))
barplot(df$Percentage,names.arg=df$Hospital,horiz=T, xpd=F,las=1, xlab="Percentage of all operations in 2012-15 \nthat are carried out in each hospital")
```

Now in ggplot2

```{r}
bp <- ggplot(df, aes(x=reorder(Hospital,-Pos), y=Percentage, fill=Hospital)) #sets initial plot object from the dataframe for Hospitals, reordered by Percentage (descending) as the y-values, colour-filled by Hospital
bp <- bp + geom_bar(stat = "identity") + labs(x="Hospital") # makes the plot a bar-chart
bp <- bp + coord_flip() # makes it a horizontal bar chart

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion 02-4-reported-partners/02-4-sexual-partners-x.Rmd
@@ -49,4 +49,4 @@ p # draw the plot
```

Figure 2.4 Data provided by Natsal-3 based on interviews between 2010 and 2012. The series have been truncated at 50 for reasons of space - the totals go up to 500 for both men and women. Note the clear use of round numbers for ten or more partners, and the tendency for men to report more partners than women.
_Figure 2.4 Data provided by Natsal-3 based on interviews between 2010 and 2012. The series have been truncated at 50 for reasons of space - the totals go up to 500 for both men and women. Note the clear use of round numbers for ten or more partners, and the tendency for men to report more partners than women._
2 changes: 1 addition & 1 deletion 02-4-reported-partners/02-4-sexual-partners-x.html

Large diffs are not rendered by default.

21 changes: 17 additions & 4 deletions 02-5-survival-vs-numbers/02-5-child-heart-surgery-x.Rmd
@@ -11,19 +11,32 @@ Data from 2012-2014 were shown in Table 1.1 (page 23) and are contained in [02-5

### Figure 2.5 (page 57) Scatterplots

```{r}
Quickly in qplot

```{r}
library(ggplot2)
# (a) Survival in under-1s, 1991-1995
child.1991 <- read.csv("02-5-child-heart-surgery-1991-x.csv") # read data into dataframe
attach(child.1991)
qplot(Operations,100*Survivors/Operations,xlim = c(0,700),ylim=c(70,100),ylab = "% 30-day survival", label = Hospital, geom=c("point", "text"),hjust=1, vjust=-1,size=2, main = "(a) Survival in under-1s, 1991-1995") + theme(legend.position="none")
# (a) Survival in under-1s, 1991-1995
#(b) Survival for all children, 2012-2015
child.2012 <- read.csv("02-5-child-heart-surgery-2012-x.csv") # read data into dataframe all
attach(child.2012)
qplot(Operations,100*Survivors/Operations,xlim = c(0,2000),ylim=c(95,100),ylab = "% 30-day survival", label = Hospital, geom=c("point", "text"),hjust=1, vjust=-1,size=2, main = "(b) Survival for all children, 2012-2015") + theme(legend.position="none")
```

```{r}
p <- ggplot(child.1991, aes(x=Operations, y=100*Survivors/Operations, col=Hospital)) #defines plot axis data fields and colour legend data field
p <- p + geom_point(aes(size=1.5)) # defines scatter-type plot
p <- p + expand_limits(x = c(0,700),y=c(70,100))
p <- p + scale_size_continuous(name = "Size", guide = FALSE) # turns off otherwise added size legend
p <- p + labs(x="Number of operations", y = "% 30-day survival", title="(a) Survival in under-1s, 1991-1995") # Adds axis labels and title
p <- p + theme(plot.caption=element_text(hjust = 0.5)) # centre justifies the caption
p
#(b) Survival for all children, 2012-2015
child.2012 <- read.csv("02-5-child-heart-surgery-2012-x.csv") # read data into dataframe all
q <- ggplot(child.2012, aes(x=Operations, y=100*Survivors/Operations,col=Hospital)) #defines plot axis data fields and colour legend data field
@@ -35,8 +48,8 @@ q
```

Figure 2.5
Scatter-plots of survival rates against number of operations in child heart surgery. For (a) 1991-1995, the Pearson correlation is 0.59 and the rank correlation is 0.85; for (b) 2012-2015, the Pearson correlation is 0.17 and the rank correlation is -0.03.
_Figure 2.5
Scatter-plots of survival rates against number of operations in child heart surgery. For (a) 1991-1995, the Pearson correlation is 0.59 and the rank correlation is 0.85; for (b) 2012-2015, the Pearson correlation is 0.17 and the rank correlation is -0.03._
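
As a rough sketch of how the two kinds of correlation quoted in the caption differ, the example below uses base R's `cor()` on a small set of made-up numbers. The vectors `operations` and `survival` are hypothetical and are not the values in the csv files; the point is only that the rank (Spearman) correlation is the Pearson correlation applied to the ranks of the data, so it responds to ordering rather than to linearity.

```r
# Hypothetical illustration only: these numbers are NOT from the data files
operations <- c(100, 200, 300, 400, 500)
survival   <- c(80, 85, 86, 87, 99)

# Pearson correlation measures how close the points are to a straight line
pearson <- cor(operations, survival, method = "pearson")

# Spearman rank correlation measures how consistently one variable
# increases with the other, whatever the shape of the relationship
spearman <- cor(operations, survival, method = "spearman")

# The rank correlation equals the Pearson correlation of the ranks
pearson_of_ranks <- cor(rank(operations), rank(survival))

print(c(pearson = pearson, spearman = spearman, pearson_of_ranks = pearson_of_ranks))
```

Because the made-up survival values always increase with the number of operations but not along a straight line, the rank correlation here is higher than the Pearson correlation.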


### Correlations in (a) 1991-1995 data
42 changes: 35 additions & 7 deletions 02-5-survival-vs-numbers/02-5-child-heart-surgery-x.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion 02-6-zero-correlations/02-6-zero-correlations-x.Rmd
@@ -26,4 +26,4 @@ p
```

Figure 2.6 Two sets of (fictitious) data-points for which the Pearson correlation coefficients are both 0. This clearly does not mean there is no relationship between the two variables being plotted.
_Figure 2.6 Two sets of (fictitious) data-points for which the Pearson correlation coefficients are both 0. This clearly does not mean there is no relationship between the two variables being plotted._
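
A minimal sketch of the same idea, using a small hypothetical data set rather than the points plotted in Figure 2.6: `y` is completely determined by `x`, yet the Pearson correlation is zero because the relationship is symmetric rather than linear.

```r
# Hypothetical data, not the points used in Figure 2.6:
# y depends exactly on x through y = x^2
x <- -2:2
y <- x^2

# Pearson correlation only captures linear association
pearson <- cor(x, y)

print(pearson)
```

Knowing `x` here tells us `y` exactly, so a zero Pearson correlation cannot be read as the absence of a relationship.
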
2 changes: 1 addition & 1 deletion 02-6-zero-correlations/02-6-zero-correlations-x.html

Large diffs are not rendered by default.
