| Type: | Package |
| Title: | Maps, Data and Methods Related to Guerry (1833) "Moral Statistics of France" |
| Version: | 1.8.3 |
| Date: | 2023-10-24 |
| Maintainer: | Michael Friendly <friendly@yorku.ca> |
| Encoding: | UTF-8 |
| Language: | en-US |
| Depends: | R (≥ 3.5.0) |
| Imports: | sp |
| Suggests: | knitr, sf, spdep, ade4, adegraphics, adespatial, RColorBrewer, corrgram, car, effects, rmarkdown, here, ggplot2, ggpcp, ggrepel, heplots, patchwork, candisc, colorspace, scales, remotes, dplyr, tidyr |
| Description: | Maps of France in 1830, multivariate datasets from A.-M. Guerry and others, and statistical and graphic methods related to Guerry's "Moral Statistics of France". The goal is to facilitate the exploration and development of statistical and graphic methods for multivariate data in a geospatial context of historical interest. |
| License: | GPL-2 | GPL-3 [expanded from: GPL] |
| URL: | https://github.com/friendly/Guerry |
| BugReports: | https://github.com/friendly/Guerry/issues |
| LazyLoad: | yes |
| LazyData: | yes |
| VignetteBuilder: | knitr |
| NeedsCompilation: | no |
| Packaged: | 2023-10-24 16:55:48 UTC; friendly |
| Author: | Michael Friendly |
| Repository: | CRAN |
| Date/Publication: | 2023-10-24 20:20:10 UTC |
Maps, Data and Methods Related to Guerry (1833) "Moral Statistics of France"
Description
Andre-Michel Guerry (1833) was the first to systematically collect and analyze social data on such things as crime, literacy and suicide with the view to determining social laws and the relations among these variables. He provided the first essentially multivariate and georeferenced spatial data on socially important questions, e.g., Is the rate of crime related to education or literacy? How does this vary over the departments of France? Are the rates of crime or suicide within departments stable over time?
In an age well before the idea of correlation had been invented, Guerry used graphics and statistical maps to try to shed light on such questions. In a later work (Guerry, 1864), he explicitly tried to address larger questions, though still with limited statistical tools: Can rates of various crimes be related to multiple causes or predictors? Are the rates and ascribable causes in France similar to or different from those found in England?
The Guerry package comprises maps of France in 1830, multivariate data from A.-M. Guerry and others (Angeville, 1836), and statistical and graphic methods related to Guerry's Moral Statistics of France. The goal of providing these as an R package is to facilitate the exploration and development of statistical and graphic methods for multivariate data in a geo-spatial context.
Details
The DESCRIPTION file:
| Package: | Guerry |
| Type: | Package |
| Title: | Maps, Data and Methods Related to Guerry (1833) "Moral Statistics of France" |
| Version: | 1.8.3 |
| Date: | 2023-10-24 |
| Authors@R: | c( person(given = "Michael", family = "Friendly", role=c("aut", "cre"), email="friendly@yorku.ca", comment = c(ORCID = "0000-0002-3237-0941")), person(given = "Stephane", family = "Dray", role="aut", email="stephane.dray@univ-lyon1.fr", comment = c(ORCID = "0000-0003-0153-1105")), person(given = "Roger", family = "Bivand", role="ctb", email = "Roger.Bivand@nhh.no") ) |
| Maintainer: | Michael Friendly <friendly@yorku.ca> |
| Encoding: | UTF-8 |
| Language: | en-US |
| Depends: | R (>= 3.5.0) |
| Imports: | sp |
| Suggests: | knitr, sf, spdep, ade4, adegraphics, adespatial, RColorBrewer, corrgram, car, effects, rmarkdown, here, ggplot2, ggpcp, ggrepel, heplots, patchwork, candisc, colorspace, scales, remotes, dplyr, tidyr |
| Description: | Maps of France in 1830, multivariate datasets from A.-M. Guerry and others, and statistical and graphic methods related to Guerry's "Moral Statistics of France". The goal is to facilitate the exploration and development of statistical and graphic methods for multivariate data in a geospatial context of historical interest. |
| License: | GPL |
| URL: | https://github.com/friendly/Guerry |
| BugReports: | https://github.com/friendly/Guerry/issues |
| LazyLoad: | yes |
| LazyData: | yes |
| VignetteBuilder: | knitr |
| Author: | Michael Friendly [aut, cre] (<https://orcid.org/0000-0002-3237-0941>), Stephane Dray [aut] (<https://orcid.org/0000-0003-0153-1105>), Roger Bivand [ctb] |
Index of help topics:
Angeville Data from d'Angeville (1836) on the population
of France
Guerry Data from A.-M. Guerry, "Essay on the Moral
Statistics of France"
Guerry-package Maps, Data and Methods Related to Guerry (1833)
"Moral Statistics of France"
gfrance Map of France in 1830 with the Guerry data
gfrance85 Map of France in 1830 with the Guerry data,
excluding Corsica
propensity Distribution of crimes against persons at
different ages
Data from Guerry and others are contained in the data frame Guerry.
Because Corsica is often considered an outlier both spatially and
statistically, the map of France circa 1830, together with the Guerry
data, is provided as SpatialPolygonsDataFrames
in two forms:
gfrance, for all 86 departments, and
gfrance85, for the 85 departments excluding Corsica.
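As a quick orientation, the three objects described above can be loaded and checked against the stated numbers of departments. This is a minimal sketch, assuming the Guerry and sp packages are installed; the name n_dept is illustrative, and the counts in the comment are those given in this documentation.

```r
library(Guerry)
library(sp)      # provides the SpatialPolygonsDataFrame class

data(Guerry)     # plain data frame
data(gfrance)    # map + data, all departments
data(gfrance85)  # map + data, excluding Corsica

# Number of departments in each object (documented as 86, 86 and 85)
n_dept <- c(Guerry    = nrow(Guerry),
            gfrance   = nrow(gfrance@data),
            gfrance85 = nrow(gfrance85@data))
print(n_dept)
```

The attribute data of the two map objects are read from the @data slot, so ordinary data frame functions apply to them.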
Author(s)
Michael Friendly [aut, cre], Stephane Dray [aut], Roger Bivand [ctb]
Maintainer: Michael Friendly <friendly@yorku.ca>
References
d'Angeville, A. (1836). Essai sur la Statistique de la Population francaise, Paris: F. Darfour.
Dray, S. and Jombart, T. (2011). Revisiting Guerry's data: Introducing spatial constraints in multivariate analysis. The Annals of Applied Statistics, 5(4), 2278-2299.
Brunsdon, C. and Dykes, J. (2007). Geographically weighted visualization: interactive graphics for scale-varying exploratory analysis. Geographical Information Science Research Conference (GISRUK 2007). NUI Maynooth, Ireland, April, 2007. https://www.maynoothuniversity.ie/national-centre-geocomputation-ncg.
Friendly, M. (2007). A.-M. Guerry's Moral Statistics of France: Challenges for Multivariable Spatial Analysis. Statistical Science, 22, 368-399. http://www.datavis.ca/papers/guerry-STS241.pdf
Friendly, M. (2007). Supplementary materials for Andre-Michel Guerry's Moral Statistics of France: Challenges for Multivariate Spatial Analysis, http://www.datavis.ca/gallery/guerry/.
Friendly, M. (2022). The life and works of Andre-Michel Guerry, revisited. Sociological Spectrum, 42, 233–259. doi:10.1080/02732173.2022.2078450
Guerry, A.-M. (1833). Essai sur la statistique morale de la France. Paris: Crochard. English translation: Hugh P. Whitt and Victor W. Reinking, Lewiston, N.Y.: Edwin Mellen Press, 2002.
Guerry, A.-M. (1864). Statistique morale de l'Angleterre comparée avec la statistique morale de la France, d'après les comptes de l'administration de la justice criminelle en Angleterre et en France, etc. Paris: J.-B. Bailliere et fils.
Data from d'Angeville (1836) on the population of France
Description
Adolphe d'Angeville (1836) presented a comprehensive statistical summary of nearly every known measurable characteristic of the French population (by department) in his Essai sur la Statistique de la Population francaise. Using the graphic method of shaded (choropleth) maps invented by Baron Charles Dupin and applied to significant social questions by Guerry, Angeville's Essai became the first broad and general application of principles of graphic representation to national industrial and population data.
The collection of variables in the data frame Angeville
is a small subset of over 120 columns presented in 8 tables and many
graphic maps.
Usage
data(Angeville)
Format
A data frame with 86 observations on the following 16 variables.
dept: a numeric vector
Department: Department name, a factor with levels Ain, Aisne, ..., Vosges, Yonne
Mortality: Number of births to give 100 people at age 21 (T1:13)
Marriages: Number of marriages per 1000 men aged 21 (T1:15)
Legit_births: Annual no. of legitimate births (T2:17)
Illeg_births: Annual no. of illegitimate births (T2:18)
Recruits: Number of people registered for military recruitment from 1825-1833 (T3:32)
Conscripts: Number of inhabitants per military conscript (T3:33)
Exemptions: Number of military exemptions per 1000 all of physical causes (T3:47)
Farmers: Number of farmers during the census in 1831 (T4:65)
Recruits_ignorant: Average number of ignorant recruits per 1000 (T5:69)
Schoolchildren: Number of schoolchildren per 1000 inhabitants (T5:71)
Windows_doors: Number of windows & doors in houses per 100 inhabitants (T5:72). This is sometimes taken as an indicator of household wealth.
Primary_schools: Number of primary schools (T5:74)
Life_exp: Life expectancy in years (T1:9a,9b)
Pop1831: Population in 1831
Details
ID codes for dept were modified from those in Angeville's tables
to match those used in Guerry.
Angeville's variables are recorded in a variety of different ways and some of these were calculated from other columns in his tables not included here. As well, the variable names and labels used here were often shortened from the more complete descriptions given by d'Angeville. The notation "(Tn:k)" indicates that the variable used here came from Table n, Column k.
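Because the dept codes were aligned with those in Guerry, the two data frames can be joined on that column. The following is a minimal sketch, assuming the Guerry package is installed; the choice of variables and the names both and r_s are illustrative only.

```r
library(Guerry)
data(Guerry)
data(Angeville)

# Join on the shared department ID (dept codes in Angeville were
# modified to match those used in Guerry)
both <- merge(Guerry[, c("dept", "Department", "Literacy")],
              Angeville[, c("dept", "Schoolchildren", "Recruits_ignorant")],
              by = "dept")

# Compare two education-related measures from the two sources,
# using a rank correlation because the scales differ
r_s <- cor(both$Literacy, both$Schoolchildren,
           method = "spearman", use = "complete.obs")
print(nrow(both))
print(r_s)
```

A rank correlation is used here only because the variables are recorded in different ways; any other measure of association could be substituted.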
Source
Angeville, A. d' (1836). Essai sur la Statistique de la Population francaise, Paris: F. Darfour.
The data was digitally scanned from Angeville's tables using OCR software, then extensively edited to correct obvious errors and finally subjected to some consistency checks using the column totals and ranked values he provided.
References
Whitt, H. P. (2007). Modernism, internal colonialism, and the direction of violence: suicide and crimes against persons in France, 1825-1830. Unpublished ms.
Examples
library(Guerry)
library(sp)
library(RColorBrewer)
data(Guerry)
data(gfrance)
data(Angeville)
gf <- gfrance # the SpatialPolygonsDataFrame
# Add some Angeville variables, transform them to ranks
gf$Mortality <- rank(Angeville$Mortality)
gf$Marriages <- rank(Angeville$Marriages)
gf$Legit_births <- rank(Angeville$Legit_births)
gf$Illeg_births <- rank(Angeville$Illeg_births)
gf$Farmers <- rank(Angeville$Farmers)
gf$Schoolchildren <- rank(Angeville$Schoolchildren)
# plot them on map of France
my.palette <- rev(brewer.pal(n = 9, name = "PuBu"))
spplot(gf,
       c("Mortality", "Marriages", "Legit_births", "Illeg_births", "Farmers", "Schoolchildren"),
       names.attr = c("Mortality", "Marriages", "Legit_births",
                      "Illeg_births", "Farmers", "Schoolchildren"),
       layout = c(3, 2),
       as.table = TRUE,
       col.regions = my.palette,
       cuts = 8, # col = "transparent",
       main = "Angeville variables")
Data from A.-M. Guerry, "Essay on the Moral Statistics of France"
Description
Andre-Michel Guerry (1833) was the first to systematically collect and analyze social data on such things as crime, literacy and suicide with the view to determining social laws and the relations among these variables.
The Guerry data frame comprises a collection of 'moral variables' on the 86 departments of France around 1830. A few additional variables have been added from other sources.
Usage
data(Guerry)
Format
A data frame with 86 observations (the departments of France) on the following 23 variables.
dept: Department ID. Standard numbers for the departments, except for Corsica (200)
Region: Region of France ('N'='North', 'S'='South', 'E'='East', 'W'='West', 'C'='Central'). Corsica is coded as NA
Department: Department name. Departments are named according to usage in 1830, but without accents. A factor with levels Ain, Aisne, Allier, ..., Vosges, Yonne
Crime_pers: Population per crime against persons. Source: A2 (Compte general, 1825-1830)
Crime_prop: Population per crime against property. Source: A2 (Compte general, 1825-1830)
Literacy: Percent read & write. Percent of military conscripts who can read and write. Source: A2
Donations: Donations to the poor. Source: A2 (Bulletin des lois)
Infants: Population per illegitimate birth. Source: A2 (Bureau des Longitudes, 1817-1821)
Suicides: Population per suicide. Source: A2 (Compte general, 1827-1830)
MainCity: Size of principal city ('1:Sm', '2:Med', '3:Lg'), used as a surrogate for population density. Large refers to the top 10, small to the bottom 10; all the rest are classed Medium. Source: A1. An ordered factor with levels 1:Sm < 2:Med < 3:Lg
Wealth: Per capita tax on personal property. A ranked index based on taxes on personal and movable property per inhabitant. Source: A1
Commerce: Commerce and industry, measured by the rank of the number of patents / population. Source: A1
Clergy: Distribution of clergy, measured by the rank of the number of Catholic priests in active service / population. Source: A1 (Almanach officiel du clergé, 1829)
Crime_parents: Crimes against parents, measured by the rank of the ratio of crimes against parents to all crimes. Average for the years 1825-1830. Source: A1 (Compte general)
Infanticide: Infanticides per capita. A ranked ratio of the number of infanticides to population. Average for the years 1825-1830. Source: A1 (Compte general)
Donation_clergy: Donations to the clergy. A ranked ratio of the number of bequests and donations inter vivos to population. Average for the years 1815-1824. Source: A1 (Bull. des lois, ordonn. d'autorisation)
Lottery: Per capita wager on Royal Lottery. Ranked ratio of the proceeds bet on the royal lottery to population. Average for the years 1822-1826. Source: A1 (Comptes rendus par le ministre des finances)
Desertion: Military desertion, ratio of the number of young soldiers accused of desertion to the force of the military contingent, minus the deficit produced by the insufficiency of available billets. Average of the years 1825-1827. Source: A1 (Compte du ministere de la guerre, 1829 etat V)
Instruction: Instruction. Ranks recorded from Guerry's map of Instruction. Note: this is inversely related to Literacy (as defined here)
Prostitutes: Prostitutes in Paris. Number of prostitutes registered in Paris from 1816 to 1834, classified by the department of their birth. Source: Parent-Duchatelet (1836), De la prostitution dans la ville de Paris
Distance: Distance to Paris (km). Distance of each department centroid to the centroid of the Seine (Paris). Source: calculated from department centroids
Area: Area (1000 km^2). Source: Angeville (1836)
Pop1831: 1831 population. Population in 1831, in 1000s, taken from Angeville (1836), Essai sur la Statistique de la Population francaise
Details
Note that most of the variables (e.g., Crime_pers) are scaled so that 'more is better' morally.
Values for the quantitative variables displayed on Guerry's maps were taken from Table A2 in the English translation of Guerry (1833) by Whitt and Reinking. Values for the ranked variables were taken from Table A1, with some corrections applied. The maximum is indicated by rank 1, and the minimum by rank 86.
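Since Crime_pers is recorded as population per crime, larger values mean fewer crimes. The sketch below shows one way to turn it into a rate and to rank it so that rank 1 marks the maximum, following the convention described above. It assumes the Guerry package is installed; the names crimes_per_100k, rank_pers and out are illustrative, and the result is not claimed to reproduce the values in Guerry's Table A1.

```r
library(Guerry)
data(Guerry)

# Crime_pers is "population per crime": invert it to get a rate
crimes_per_100k <- 1e5 / Guerry$Crime_pers

# rank() gives rank 1 to the smallest value, so negate the variable
# to give rank 1 to the maximum
rank_pers <- rank(-Guerry$Crime_pers)

out <- data.frame(Department      = Guerry$Department,
                  Crime_pers      = Guerry$Crime_pers,
                  crimes_per_100k = round(crimes_per_100k, 2),
                  rank_pers       = rank_pers)

# Departments with the largest population per crime come first
print(head(out[order(out$rank_pers), ]))
```

Ranking the raw "population per" variable keeps the 'more is better' orientation, whereas the derived rate runs in the opposite direction.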
Source
Angeville, A. (1836). Essai sur la Statistique de la Population française. Paris: F. Doufour.
Guerry, A.-M. (1833). Essai sur la statistique morale de la France. Paris: Crochard. English translation: Hugh P. Whitt and Victor W. Reinking, Lewiston, N.Y.: Edwin Mellen Press, 2002.
Parent-Duchatelet, A. (1836). De la prostitution dans la ville de Paris, 3rd ed, 1857, p. 32, 36
References
Dray, S., & Jombart, T. (2011). Revisiting Guerry's data: Introducing spatial constraints in multivariate analysis. Annals of Applied Statistics, 5, 2278-2299
Brunsdon, C. and Dykes, J. (2007). Geographically weighted visualization: Interactive graphics for scale-varying exploratory analysis. Geographical Information Science Research Conference (GISRUK 07), NUI Maynooth, Ireland, April, 2007.
Friendly, M. (2007). A.-M. Guerry's Moral Statistics of France: Challenges for Multivariable Spatial Analysis. Statistical Science, 22, 368-399.
Friendly, M. (2007). Data from A.-M. Guerry, Essay on the Moral Statistics of France (1833), https://www.datavis.ca/gallery/guerry/guerrydat.html.
See Also
Angeville for other analysis variables
Examples
library(car)
data(Guerry)
# Is there a relation between crime and literacy?
# Plot personal crime rate vs. literacy, using data ellipses.
# Identify the departments that stand out
set.seed(12315)
with(Guerry, {
  dataEllipse(Literacy, Crime_pers,
              levels = 0.68,
              ylim = c(0, 40000), xlim = c(0, 80),
              ylab = "Pop. per crime against persons",
              xlab = "Percent who can read & write",
              pch = 16,
              grid = FALSE,
              id = list(method = "mahal", n = 8, labels = Department, location = "avoid", cex = 1.2),
              center.pch = 3, center.cex = 5,
              cex.lab = 1.5)
  # add a 95% ellipse
  dataEllipse(Literacy, Crime_pers,
              levels = 0.95, add = TRUE,
              ylim = c(0, 40000), xlim = c(0, 80),
              lwd = 2, lty = "longdash",
              col = "gray",
              center.pch = FALSE
  )
  # add the LS line and a loess smooth.
  abline(lm(Crime_pers ~ Literacy), lwd = 2)
  lines(loess.smooth(Literacy, Crime_pers), col = "red", lwd = 3)
}
)
# A corrgram to show the relations among the main moral variables
# Re-arrange variables by PCA ordering.
library(corrgram)
corrgram(Guerry[,4:9], upper=panel.ellipse, order=TRUE)
Map of France in 1830 with the Guerry data
Description
gfrance is a SpatialPolygonsDataFrame object created with the
sp package, containing the polygon boundaries of the map of
France as it was in 1830, together with the Guerry
data frame.
Usage
data(gfrance)
Format
The format is: Formal class 'SpatialPolygonsDataFrame' [package "sp"] with 5 slots: gfrance@data,
gfrance@polygons, gfrance@plotOrder, gfrance@bbox, gfrance@proj4string.
See: SpatialPolygonsDataFrame for descriptions of some components.
The analysis variables, represented in gfrance@data are described in Guerry.
Details
In the present version, the PROJ4 projection is not specified.
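The projection information and bounding box can be inspected directly. This is a minimal sketch, assuming the sp package is installed; the conversion step runs only if the suggested sf package is available, and gfrance_sf is an illustrative name.

```r
library(Guerry)
library(sp)
data(gfrance)

# Inspect the coordinate reference system and the bounding box;
# the CRS is expected to be missing if no projection is specified
print(proj4string(gfrance))
print(bbox(gfrance))

# Optional: convert to an sf object for use with sf-based tools
if (requireNamespace("sf", quietly = TRUE)) {
  gfrance_sf <- sf::st_as_sf(gfrance)
  print(class(gfrance_sf))
}
```

Without a specified projection, coordinates should be treated as plain planar units when computing distances or areas.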
Source
Friendly, M. (2007). Supplementary materials for Andre-Michel Guerry's Moral Statistics of France: Challenges for Multivariate Spatial Analysis, http://www.datavis.ca/gallery/guerry/.
References
Friendly, M. (2007). A.-M. Guerry's Moral Statistics of France: Challenges for Multivariable Spatial Analysis. Statistical Science, 22, 368-399.
See Also
Guerry for description of the analysis variables
Angeville for other analysis variables
Examples
library(sp)
data(gfrance)
names(gfrance) ## list @data variables
plot(gfrance) ## just show the map outline
# Show basic choropleth plots of some of the variables
spplot(gfrance, "Crime_pers")
# use something like Guerry's palette, where dark = Worse
my.palette <- rev(RColorBrewer::brewer.pal(n = 9, name = "PuBu"))
spplot(gfrance, "Crime_pers", col.regions = my.palette, cuts = 8)
spplot(gfrance, "Crime_prop")
# Note that spplot assumes all variables are on the same scale for comparative plots
# transform variables to ranks (as Guerry did)
## Not run:
local({
  gfrance$Crime_pers <- rank(gfrance$Crime_pers)
  gfrance$Crime_prop <- rank(gfrance$Crime_prop)
  gfrance$Literacy <- rank(gfrance$Literacy)
  gfrance$Donations <- rank(gfrance$Donations)
  gfrance$Infants <- rank(gfrance$Infants)
  gfrance$Suicides <- rank(gfrance$Suicides)
  spplot(gfrance, c("Crime_pers", "Crime_prop", "Literacy", "Donations", "Infants", "Suicides"),
         layout = c(3, 2), as.table = TRUE, main = "Guerry's main moral variables")
})
## End(Not run)
Map of France in 1830 with the Guerry data, excluding Corsica
Description
gfrance85 is a SpatialPolygonsDataFrame object created with the
sp package, containing the polygon boundaries of the map of
France as it was in 1830, together with the Guerry
data frame. This version excludes Corsica, which is an outlier
both in the map and in many analyses.
Usage
data(gfrance85)
Format
The format is:
Formal class 'SpatialPolygonsDataFrame' [package "sp"] with 5 slots: gfrance85@data,
gfrance85@polygons, gfrance85@plotOrder, gfrance85@bbox, gfrance85@proj4string.
See: SpatialPolygonsDataFrame for descriptions of some components.
The analysis variables are described in Guerry.
Details
In the present version, the PROJ4 projection is not specified.
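When the Guerry data frame is analysed alongside this map, Corsica needs to be dropped from the data frame as well. A minimal sketch, assuming the Guerry and sp packages are installed and relying on the documented coding of Corsica as dept 200; Guerry85 is an illustrative name.

```r
library(Guerry)
library(sp)
data(Guerry)
data(gfrance85)

# Corsica is coded with dept == 200 in the Guerry data frame
Guerry85 <- Guerry[Guerry$dept != 200, ]

# Both objects should now describe the same number of departments
print(nrow(Guerry85))
print(nrow(gfrance85@data))
```

Filtering on dept avoids depending on the spelling of the department name.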
Source
Friendly, M. (2007). Supplementary materials for André-Michel Guerry's Moral Statistics of France: Challenges for Multivariate Spatial Analysis, http://datavis.ca/gallery/guerry/.
References
Dray, S. and Jombart, T. (2009). A Revisit Of Guerry's Data: Introducing Spatial Constraints In Multivariate Analysis. Unpublished manuscript.
Friendly, M. (2007). A.-M. Guerry's Moral Statistics of France: Challenges for Multivariable Spatial Analysis. Statistical Science, 22, 368-399.
Examples
data(gfrance85)
require(sp)
require(scales)
plot(gfrance85) # plot the empty outline map
# extract some useful components
df <- data.frame(gfrance85)[,7:12] # main moral variables
xy <- coordinates(gfrance85) # department centroids
dep.names <- data.frame(gfrance85)[,6]
region.names <- data.frame(gfrance85)[,5]
col.region <- colors()[c(149,254,468,552,26)] |>
scales::alpha(alpha = 0.2)
# plot the map showing regions by color with department labels
op <-par(mar=rep(0.1,4))
plot(gfrance85,col=col.region[region.names])
text(xy, labels=dep.names, cex=0.6)
par(op)
Distribution of crimes against persons at different ages
Description
This dataset comes from Plate IV, "Influence de l'age", of Guerry (1833), transcribed in Whitt & Reinking's (2002) translation as Table 9A, pp. 38-43. It gives the rank ordering of crimes against persons in seven age groups, in long form.
Usage
data("propensity")
Format
A data frame with 124 observations on the following 4 variables.
age: a character vector, with 7 age groups, <21, 21-30, 30-40, ..., 60-70, 70-
rank: a numeric vector, rank of the crime within each age group
crime: a character vector, label of the crime
share: a numeric vector, share (frequency) of the crime in a population of 1000
Details
For each age group (both males and females), the 17 most frequent crimes are listed in rank order, followed by an 'Other crime' category.
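Because the data are in long form, the leading crimes in each age group can be pulled out with ordinary subsetting. This is a minimal sketch, assuming the Guerry package is installed; top3 is an illustrative name, and how the 'Other crime' category is ranked is not assumed.

```r
library(Guerry)
data(propensity)

# Overview of the four columns: age, rank, crime, share
str(propensity)

# The three highest-ranked crimes in each age group
top3 <- propensity[!is.na(propensity$rank) & propensity$rank <= 3,
                   c("age", "rank", "crime", "share")]
top3 <- top3[order(top3$age, top3$rank), ]
print(top3, row.names = FALSE)
```

Note that age is a character vector, so sorting on it is alphabetical rather than by age.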
Source
H. P. Whitt and V. W. Reinking (2002). A Translation of André-Michel Guerry's Essay on the Moral Statistics of France. Lewiston, N.Y.: Edwin Mellen Press.
References
Guerry, A.-M. (1833). Essai sur la statistique morale de la France. Paris: Crochard.
Examples
data(propensity)
## maybe str(propensity) ; plot(propensity) ...