How do you combine two shapefiles

Join tables (QGIS3)

Not all data sets that we want to use are available as spatial data. Often the data come as tables or spreadsheets, and we have to combine them with existing spatial data before we can use them for analysis. This operation is known as a table join. In QGIS it is carried out with the Join attributes by field value algorithm.
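
To make the idea of a table join concrete, here is a minimal sketch in plain Python. It is not how QGIS implements the algorithm; it only illustrates matching records on a shared key. The function name `join_by_field`, the field names, and the sample values are all hypothetical, and keeping unmatched features unchanged is an assumption of this sketch.

```python
def join_by_field(features, rows, feature_key, row_key):
    """Attach the attributes of matching table rows to each feature."""
    # Index the table by its key field so each lookup is a dictionary access.
    lookup = {row[row_key]: row for row in rows}
    joined = []
    for feature in features:
        match = lookup.get(feature[feature_key])
        if match is None:
            # Assumption of this sketch: unmatched features are kept as they are.
            joined.append(dict(feature))
            continue
        # Copy every table column except the key, which the feature already has.
        extra = {k: v for k, v in match.items() if k != row_key}
        joined.append({**feature, **extra})
    return joined


# Hypothetical sample data; the IDs and values are made up.
tracts = [
    {"tract_id": "A1", "name": "Tract 1"},
    {"tract_id": "B2", "name": "Tract 2"},
    {"tract_id": "C3", "name": "Tract 3"},
]
population = [{"id": "A1", "pop": 4200}, {"id": "B2", "pop": 3100}]

joined = join_by_field(tracts, population, "tract_id", "id")
```

The join only works when both sides store the key in the same form, which is why the workflow below first checks for a common unique ID in the layer and in the table.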

Task overview

We will use a shapefile of census tracts for California and a population data table from the US Census Bureau to create a population density map for California.

Other skills we learn

  • Load CSV files without spatial information into QGIS

  • Use DB Manager to run SQL queries to generate summary statistics

Obtaining the data

The US Census Bureau offers TIGER/Line shapefiles. We can download a shapefile of the California census tracts from the FTP site. We are downloading the Census Tracts for California file.

American FactFinder is a directory of all US census data. We use Advanced Search and look for Topic - Basic Count / Estimate and Geographies - All Census Tracts in California to create and download a customized CSV file. These instructions use that data.

For the sake of simplicity, we can download a copy of both data sets from the following links:

tl_2018_06_tract.zip

ACS_17_5YR_B01003.zip

Data source [TIGER] [USCENSUS]

Workflow

  1. Locate the downloaded census tracts file (tl_2018_06_tract.zip) in the QGIS Browser panel and expand it. Select the shapefile inside it and drag it onto the map canvas.

  1. The layer is now loaded in the Layers panel. It contains the boundaries of the California census tracts. Right-click the layer and choose Open Attribute Table.

  1. We look at the attributes of the layer. To join a table to this layer, we need an attribute that is unique for each feature and common to both data sets. In this case, one field contains a unique identifier for each census tract. It can be used to join this layer to any other layer or table that uses the same ID.

  1. We unzip the ACS_17_5YR_B01003.zip file and open the CSV file in a text editor. Each line contains information about one census tract, including the same unique ID as in the previous step, stored in one of the fields of the CSV file. The HD01_VD01 column contains the population estimate for each tract.

  1. Before we import the CSV file, we need to make a small change. The QGIS CSV import expects the first line of a file to contain the column headers and all other lines to contain the relevant data. Line 2 of this file contains further column headers. We delete this line and save the file.

  1. Now we can import the CSV file into QGIS. Go to Layer ‣ Add Layer ‣ Add Delimited Text Layer….

  1. In the Data Source Manager window, click the ... button and select the CSV file. Make sure CSV (comma separated values) is selected under File format. Since we are importing a table with no geometry, select the No geometry (attribute only table) option. Check that the sample data preview looks correct, click Add, and then click Close.

  1. The CSV file is now imported into QGIS and appears as a table in the Layers panel. Now we can join it to the census tracts layer. Go to Processing ‣ Toolbox.

  1. First we have to change a setting of the toolbox. Click the Options button.

  1. Under Processing, check the box next to Use file names as layer names. This option makes the output layers of the processing tools easier and more intuitive to use. Click OK.

  1. In the Processing Toolbox, find the Vector general ‣ Join attributes by field value algorithm and double-click it to open it.

  1. In the Join Attributes by Field Value dialog box, we select the census tracts layer as the Input layer and its unique ID field as the Table field. We select the imported CSV table as Input layer 2 and its matching ID field as Table field 2. We leave the other options unchanged, click the ... button, and then select the output file.

  1. We enter a name for the GeoPackage and a name for the output layer. Click Run.

  1. After the processing is finished, we make sure that the algorithm has been processed successfully and click on Close.

  1. A new layer now appears in the Layers panel. The fields of the CSV file have been joined to the census tracts layer. We can close the Processing Toolbox. Right-click the new layer and choose Open Attribute Table.

  1. We see the additional fields from the CSV file, including HD01_VD01, which contains the population estimate.

  1. Now that we have the population data in the census tracts layer, we can style it to create a visualization of population density distribution. Select the layer and click the Open the Layer Styling Panel button.

  1. In the Layer Styling panel, select the renderer type from the drop-down menu. As we are looking to create a population density map, we want to assign a different color to each census tract based on its population density. We have the population in the HD01_VD01 field, but no field holds the population density that we could select as the value. Fortunately, QGIS allows us to enter an expression here. Click the Expression button.

Note

When creating a thematic (choropleth) map such as this, it is important to normalize the values you are mapping. Mapping total counts per polygon is misleading, because larger polygons tend to contain larger totals. Here we normalize the population by dividing it by the area. If you are displaying totals such as crime, you can normalize them by dividing by the total population, thus mapping crime rate rather than crime count.
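
The two normalizations mentioned in this note can be sketched in a few lines of Python. This is an illustration only: the function names and the sample numbers are hypothetical, and the factor 0.386 is the rounded square-kilometer to square-mile conversion that this tutorial's density expression also uses.

```python
SQ_MI_PER_SQ_KM = 0.386  # rounded conversion factor (1 sq km is about 0.386 sq mi)


def density_per_sq_mile(population, area_sq_m):
    """People per square mile for an area given in square meters."""
    area_sq_km = area_sq_m / 1e6  # square meters to square kilometers
    return population / (SQ_MI_PER_SQ_KM * area_sq_km)


def rate_per_1000(count, population):
    """Events per 1,000 residents, for example a crime rate."""
    return 1000 * count / population


# Hypothetical tracts: same population, very different areas.
small_tract = density_per_sq_mile(4000, 2_000_000)    # 2 sq km
large_tract = density_per_sq_mile(4000, 200_000_000)  # 200 sq km

# Hypothetical crime counts: the area with fewer crimes has the higher rate.
north_rate = rate_per_1000(120, 60000)
south_rate = rate_per_1000(90, 15000)
```

Both tracts would get the same color on a map of raw population, although the small tract is about a hundred times denser. Likewise, the second area reports fewer crimes in total but has the higher rate per resident.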

  1. Enter the following expression to calculate the population density. $area calculates the area of the feature in square meters. Dividing by 1e6 converts it to square kilometers, and multiplying by 0.386 converts that to square miles. The population density is then Population / Area. Click OK.

"HD01_VD01" / (0.386 * $area / 1e6)
  1. Back in the Layer Styling Panel, choose a color ramp of your choice and click Classify. You can adjust the class ranges to be more appropriate to the region.

  1. The visualization looks a bit cluttered because of the polygon borders. Click the drop-down next to Symbol. Select Simple fill and check Transparent stroke.

  1. Now we have a nice-looking visualization of population density in California.
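
The CSV clean-up step in the workflow (deleting the second header line before import) can also be scripted instead of done by hand in a text editor. The following is a minimal sketch under stated assumptions: the helper names `drop_second_line` and `clean_csv` are hypothetical, the file is assumed to be UTF-8 encoded, and the sample content is made up apart from the HD01_VD01 column name.

```python
from pathlib import Path


def drop_second_line(text):
    """Return the text without its second line (the extra header row)."""
    lines = text.splitlines(keepends=True)
    return "".join(lines[:1] + lines[2:])


def clean_csv(src, dst, encoding="utf-8"):
    """Write a copy of src to dst with the second line removed.

    The encoding default is an assumption; adjust it to match your file.
    """
    text = Path(src).read_text(encoding=encoding)
    Path(dst).write_text(drop_second_line(text), encoding=encoding)


# Made-up sample: line 1 holds field names, line 2 holds descriptive headers.
sample = "id,HD01_VD01\nId,Estimate; Total\nA1,4200\n"
cleaned = drop_second_line(sample)
```

Writing the result to a new file rather than overwriting the download keeps the original available if the import needs to be repeated.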