geoJson data for umbrella projects and their subsidiaries #27

Open
cc50liu opened this issue Apr 9, 2023 · 1 comment

cc50liu commented Apr 9, 2023

Is there a data source that contains a structured representation of the one-to-many relationship between umbrella projects and their subsidiary projects?

The AidData WorldBank dataset handles one-to-many relationship situations by providing separate files; for example, the locations.csv file can have multiple locations for each project record found in the projects.csv file. Is there something comparable to that in the China dataset where I can get a list of all the subsidiary project ids below an umbrella project?

I'm asking because I notice that some umbrella projects have geoJson data associated with them, and I would like to confirm that their sub-projects also have specific geoJson data before I exclude the umbrella projects from my analysis. Here are some project ids of umbrella projects that have geoJson data:
41552
53002
30171
53184
56910
1713
1557
Is reading the text descriptions the only way to identify their subsidiary projects so I can verify the subsidiaries also have geoJson data? Or, is there some structured data available that would facilitate this? Many thanks.

sgoodm (Member) commented Apr 11, 2023

@cc50liu, umbrella projects and related projects are going to have a fairly different relationship than the project/location relationship of previously geocoded datasets such as the World Bank one you mentioned.

The short version is that umbrella projects should be dropped from most uses of the geospatial features, and any non-umbrella projects referenced by an umbrella project should be treated as individual unrelated projects.
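As a minimal sketch of that short version, the snippet below drops umbrella records from a GeoJSON FeatureCollection using only the Python standard library. The property name `umbrella`, its possible encodings, and the sample features are assumptions for illustration; check the actual field name and values in the file you downloaded.

```python
def is_umbrella(feature, flag_key="umbrella"):
    """True if the feature's properties mark it as an umbrella record."""
    props = feature.get("properties") or {}  # GeoJSON allows null properties
    value = props.get(flag_key)
    # Accept booleans as well as common string encodings of "true".
    if isinstance(value, str):
        return value.strip().lower() in {"yes", "true", "1"}
    return bool(value)


def drop_umbrella_features(collection, flag_key="umbrella"):
    """Return a copy of the collection without umbrella features."""
    kept = [f for f in collection["features"] if not is_umbrella(f, flag_key)]
    return {**collection, "features": kept}


# Made-up features; "umbrella" is a hypothetical property name.
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": None, "properties": {"id": 1, "umbrella": True}},
        {"type": "Feature", "geometry": None, "properties": {"id": 2, "umbrella": False}},
        {"type": "Feature", "geometry": None, "properties": {"id": 3, "umbrella": "Yes"}},
        {"type": "Feature", "geometry": None, "properties": None},
    ],
}

filtered = drop_umbrella_features(sample)
print(len(filtered["features"]))
```

Features with no flag at all are kept here, which is a design choice: it avoids silently discarding records when the field is missing, at the cost of possibly keeping an unflagged umbrella record.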

The long version:

Umbrella projects are provided for additional context on agreements and financial arrangements that are in many practical ways separate from actual project activities. The two examples of umbrella projects are:

  1. New financial agreements not yet tied to a specific project. E.g., China agrees to loan a country $10 million for improving transportation. Then, there could be two non-umbrella projects at a later date for "developing a railway from A to B" and "improving roads from C to D". Each of these would have its own funding amounts, geospatial features, etc. Critically, the geospatial feature of the umbrella project does not necessarily reflect the geospatial features of the implemented projects.
  2. Debt relief tied to a previous financial agreement project. E.g., China agrees to forgive the debt from a loan to build a railway from A to B. Similar to the first case, the actual project was the railway from A to B, not the relief of that debt.

And the actual documentation describing umbrella projects for more context:

A record in the dataset is identified as an “umbrella” project in two circumstances. The first circumstance is when a financial agreement was signed by at least one party in the donor/creditor country and one party in the receiving country, but funds were not allocated for a specific project/purpose (or set of projects/purposes) until a subsequent date. These types of umbrella agreements include Economic and Technical Cooperation Agreements (ECTA) issued by China’s Ministry of Commerce (MOFCOM), master facility agreements issued by China Eximbank, lines of credit issued by China Development Bank, and Framework Agreements issued by a variety of official sector institutions in China. Due to the nature of the TUFF data collection process, the subsidiary projects/transactions approved and financed under these types of umbrella agreements are likely captured elsewhere in the dataset. The second circumstance is when a debt forgiveness project could involve loans captured elsewhere in the 2.0 dataset. If a debt forgiveness project involves loans contracted before 2000 (when the 2.0 dataset begins its coverage of Chinese ODA and OOF), then the debt forgiveness project will not be marked as an umbrella. However, if the debt forgiveness project involves a loan contracted during the 2000-2017 period, or it is unclear when the original loan was contracted, then the project is designated as an umbrella project to avoid double counting. Umbrella projects are included in the 2.0 dataset to clarify linkages between projects and to capture relevant activities without double-counting financial amounts or project counts. As a general rule, no umbrella records should be included in financial analysis or analysis of project counts as doing so will almost certainly result in double-counting.

The effective duplication of some project information across the umbrella projects and connected non-umbrella projects is a core reason for distinguishing between the two. If you aggregate the total commitment value of a set of connected umbrella and non-umbrella projects, you would potentially be doubling the actual value of the projects.
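To make the double-counting concrete, here is a small arithmetic sketch based on the $10 million transportation example above. The split of that amount into 6 and 4 million, and the field names, are invented for illustration.

```python
# Hypothetical records: one umbrella agreement and the two projects funded under it.
records = [
    {"name": "transport loan agreement", "umbrella": True, "commitment_usd": 10_000_000},
    {"name": "railway from A to B", "umbrella": False, "commitment_usd": 6_000_000},
    {"name": "roads from C to D", "umbrella": False, "commitment_usd": 4_000_000},
]

# Summing every record counts the same money twice.
naive_total = sum(r["commitment_usd"] for r in records)

# Excluding umbrella records recovers the amount actually committed.
deduplicated_total = sum(r["commitment_usd"] for r in records if not r["umbrella"])

print(naive_total)         # 20000000
print(deduplicated_total)  # 10000000
```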

As a side note, given that umbrella project geospatial features are typically not useful and potentially even inaccurate (relative to what ends up being actually implemented) we will likely revisit whether umbrella projects should be included at all in the geospatial dataset.

The last potential scenario I will mention is if you are linking your analysis of the geospatial features to broader policy (or similar) analysis in which, for example, you might need to consider the relationship or timing of umbrella projects (initial agreements), actual project implementation, and/or other umbrella projects (debt relief). There is no formal lookup for linking umbrella projects to non-umbrella projects, but some members of our team have written code to parse these relationships out for their own analysis (though it may not cover all umbrella projects). If you end up needing to head down this road, I can put you in touch with my colleagues.
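For readers who want a rough starting point before reaching out, the sketch below shows one way such a lookup might be approximated from description text. It is not the team's code. It assumes descriptions cite related records in a form like `Project ID#12345`, and the record layout (`id`, `umbrella`, `description`) is hypothetical; both would need to be checked against the real data.

```python
import re

# Assumed citation pattern, e.g. "Project ID#12345"; adjust to the real text.
PROJECT_REF = re.compile(r"Project\s+ID\s*#\s*(\d+)", re.IGNORECASE)


def referenced_ids(description, own_id=None):
    """Return the sorted, de-duplicated project ids cited in a description."""
    ids = {int(m) for m in PROJECT_REF.findall(description)}
    ids.discard(own_id)  # ignore a record citing itself
    return sorted(ids)


def build_lookup(records):
    """Map each umbrella record id to the ids its description mentions."""
    return {
        r["id"]: referenced_ids(r["description"], own_id=r["id"])
        for r in records
        if r["umbrella"]
    }


# Made-up records for illustration only.
records = [
    {"id": 100, "umbrella": True,
     "description": "Framework agreement; see Project ID#101 and project id #102."},
    {"id": 101, "umbrella": False, "description": "Railway from A to B."},
    {"id": 102, "umbrella": False, "description": "Roads from C to D."},
]

lookup = build_lookup(records)
print(lookup)
```

A lookup built this way only finds links that are written out in the text, so it can miss subsidiaries, and each referenced id would still need to be checked for its own geospatial feature.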
