BeZero Household Devices methodology response
We thank the Project Developer Forum for taking the time and care to read through our Household Devices methodology. All feedback is welcome and appreciated, and we hope to maintain an open dialogue should any comments require further response.
We would also like to make it explicitly clear that this methodology provides both project-specific and broader, general examples as case studies where relevant, to illustrate either edge-case applications of our ratings methodology for the sector or broader trends within the market. It serves as a detailed methodology document which sets the benchmark for others in the market and guides users through the application of BeZero Carbon's analytical considerations. It is by no means an exhaustive document detailing the specificities of each rated project in our portfolio; our analysis always includes project-specific context. As such, it is possible for each risk level assessment to encompass a range of project quality, as we steer clear of binary assessments where possible.
All comments provided by the PD Forum have been addressed below.
(1) According to your rating eligibility (page 20), BZC will only rate a project if they have public access to the crediting calculations which makes sense. However, VCS projects are not required to share their ER calculation sheets publicly on the platform. Similarly, for older projects/verifications the ER calculation sheet has not always been made publicly available in the GS registry. This may hinder the rating of these projects? A suggestion would be contacting the PD directly to ask for these calculations and any other information that can help inform the overall rating of the project?
In order to satisfy our primary eligibility criteria, a project must provide sufficient project claims publicly, including crediting calculations. At a minimum, this includes the following data for each vintage: baseline emissions, project emissions, leakage emissions, buffer deductions, and total emissions reductions. This data is fundamental to understanding the building blocks of credit issuance and is generally a basic disclosure requirement in project reporting under standards bodies.
Emissions reductions calculation sheets do provide a more granular breakdown of project accounting and are welcomed and encouraged by BeZero. However, if the above data is otherwise made available, then these sheets are not essential to qualify for a BeZero Rating.
In instances where a project is missing essential information, we actively engage with developers with the aim of making such information publicly available and bringing a greater level of transparency to the market. We acknowledge that public provision of this data is not a requirement for accreditation, but entities are often happy to facilitate its publication. To this end, our project developer engagement team has spoken with 172 project developers about their ratings and successfully encouraged additional disclosure in 50+ cases.
(2) There is a misleading statement regarding additionality demonstration for micro-scale VPAs using a positive list - Box 4 (page 24) and Figure 9 (page 25). While it is true that the rules for demonstrating additionality can differ between micro-, small-, and large-scale projects, there are some cases where the rules are exactly the same. For instance, in the case of GS, it is possible to use “deemed additionality” for additionality demonstration, which is not limited only to micro-scale projects but is applicable for all projects (regardless of scale) located in LDC, SIDS, and LLDC countries.
In Box 4, we note that the size of a project ‘can’ be a determining factor in additionality requirements, but it does not state that this is the only factor. This case study instead serves to illustrate just one example, rather than being exhaustive, and encompasses only one component of our additionality analysis. When rating a project, regardless of the additionality tests or qualifiers that they are subject to, we aim to be exhaustive in our approach and will assess all relevant parameters to ensure a fungible assessment. We also list several alternative qualifiers for automatic additionality, including project location, in the preceding paragraph. This explicitly references projects located in LDC, SIDS, and LLDC countries, so is in alignment with your comment.
(3) In the over-crediting section (page 48), BZC states that they prefer the use of the WISDOM model when calculating the fNRB value for a project. Will BZC just compare the project fNRB value to that of the WISDOM model, or will they also be willing to ask for and assess the fNRB calculations/report used by the PD if the value was calculated independently? Additionally, household device projects have shorter crediting periods compared to NBS projects for example. During the crediting period renewal, certain parameters are re-evaluated and updated. For instance, GS mandates reassessment of fNRB based on the most recent information available at the time of renewal, which occurs every five years.
BeZero incorporates all publicly available information relevant to each project into our ratings. This includes an examination of project and/or independently calculated fNRB values and the assumptions that go into them. Our team examines the quality of the data sources, inputs, and assumptions a project uses in calculating its fNRB to assess whether the claimed value is appropriate and conservative. Where additional, independent assessments have been made to derive fNRB values, we include these and engage with developers to seek this data. In practice, we find that only 20% of the cookstove projects we have rated publicly note independent fNRB assessments, yet none actually provide them. We have devised data disclosure guidelines for developers to address transparency concerns, and our ratings are dynamic. Therefore, where ratings have been assigned, we will continue to monitor and assess any new information that is made publicly available.
In our view, the WISDOM and MoFuSS models currently represent the most accurate peer-reviewed sources of data on fNRB for Household Devices projects. We therefore use them, alongside other data from the literature and industry datasets, as points of comparison against project-reported fNRB values.
Our rating is also dynamic and vintage specific. This means that if any project parameters are updated, this is taken into account for the relevant vintages to which the value is applied. For example, we find that across the 30 projects that use fNRB, as of October 2023, only 17% have updated the value. Of these projects, the updates resulted in fNRB both increasing and decreasing across vintages and this factors into our vintage-level assessments.
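To make concrete why the choice of fNRB value has such a large bearing on over-crediting risk, the sketch below shows how credited baseline emissions scale linearly with fNRB. The function name, parameter names, and default emission factor are illustrative assumptions only, not the formula of BeZero or of any particular standard:

```python
def baseline_emissions_tco2(wood_saved_t: float, fnrb: float,
                            ef_wood_tco2_per_t: float = 1.75) -> float:
    """Simplified baseline emissions credited to a cookstove project.

    Only the non-renewable fraction (fNRB) of the woody biomass saved is
    counted as avoided CO2; the renewable fraction is assumed to regrow
    and is excluded. All parameter values here are hypothetical.
    """
    return wood_saved_t * fnrb * ef_wood_tco2_per_t

# Sensitivity: the same project, saving 1,000 t of wood per year, under a
# project-claimed fNRB of 0.9 versus an independently modelled value of 0.3.
claimed  = baseline_emissions_tco2(1000, 0.9)   # 1575.0 tCO2
modelled = baseline_emissions_tco2(1000, 0.3)   # 525.0 tCO2
```

Under these hypothetical numbers, the claimed fNRB yields three times the emissions reductions implied by the modelled value, which is why benchmarking project-reported fNRB against independent models is central to our over-crediting assessment.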
(4) The WISDOM model also contains significant uncertainties (such as using a dataset from 2009 and making assumptions regarding travel time to accessible biomass). Applying any default value for a specific jurisdiction fails to account for project-specific criteria, such as the accessibility and usage of renewable fuels in remote communities. It also fails to acknowledge the difference between project types; for example, a charcoal project relies on a completely different fuel source to a wood stove project (one sources fuel from a ‘market’, one via local collection), which can significantly impact the fNRB calculation. This is why fNRB calculations at the project level are still an appropriate approach, which could be tempered by a justification for why the value is at variance to WISDOM etc. CO2 emissions reduction calculations should prioritize accuracy over conservativeness, and localized fNRB studies, particularly for rural woodfuel projects, can still be the most appropriate. We would propose a more flexible approach for such projects, that is not automatically penalized by ratings agencies for being at variance to modelled defaults.
As mentioned above, in our view, the WISDOM and MoFuSS models represent the most comprehensive peer-reviewed source of data for global fNRB values to date. We agree that accuracy should be prioritised, but fNRB is a parameter laced with uncertainty such that this may not always be possible. As part of our over-crediting assessments, we always take into account uncertainties regarding both the models applied by a project and those that we benchmark against to interrogate the range of estimates possible. Further to this, we consider updates to benchmark models over time and the release of new research as part of our continuous monitoring process.
For example, on the WISDOM model specifically, we view the associated uncertainties to be significantly reduced in comparison to the available methods under TOOL30. This is further evidenced by the UNFCCC’s recent information note, which outlines a proposal to develop new default fNRB values using the MoFuSS model (based on the same concepts as the WISDOM model). These values will be derived from more recent datasets and should therefore counter concerns surrounding outdated values.
To reiterate, BeZero’s analysis also incorporates bottom-up sources of information such as local studies and project-reported information. When this information is unavailable or projects have instead relied on now-outdated default fNRB values, we view the WISDOM and MoFuSS models to be a more reliable point of reference.
(5) In Box 22 on page 52, an ICS project in Malawi is discussed and it is said that they used a WBT to determine a “lower thermal efficiency of 25.6%” and also mentions using a KPT to determine efficiency. You are mixing these approaches up: WBTs are used to determine thermal efficiency of the stove and KPTs determine the fuel efficiency, as stated earlier on in the “Thermal Efficiency” section. Box 22 is also confusing, it states that WBTs “can be driven by myriad factors such as geography, climate, and cooking practices” and then argues that there is uncertainty surrounding assumptions because there are variable results. Which one is it? It is either variable owing to project-specific factors, or consistent across the board (your measure of ‘certainty’). It can’t be both. It is widely acknowledged that KPTs provide the most accurate picture of actual fuel usage at the household level in both the pre- and post-project scenarios, and therefore mitigate the variables and uncertainties you identify in WBTs.
BeZero thanks you for bringing this to our attention. With regard to the case study of the thermal efficiency of a stove in Malawi, the thermal efficiency implied by the Kitchen Performance Test (KPT) was estimated from the reported efficiency improvement (55% more efficient) relative to the three-stone fire (project-assumed 10% thermal efficiency). BeZero will clarify this section to make clearer how such values were derived. BeZero agrees that KPTs provide a more accurate picture of actual fuel usage at the household level, although this depends on the sampling technique; this view is noted on page 51 of the Household Devices Methodology. Nonetheless, we still find some risk associated with KPTs, largely as a result of social biases. There are also uncertainties stemming from household heterogeneity and the myriad environmental and socio-economic factors that may not be sufficiently captured in sample groups.
(6) In Box 24 on page 55, the HAP performance of stoves is discussed.
a. We are not sure how this links to a rating for GHG emissions reductions? If you are rating for health impacts, then this could/should be considered, but this is a GHG emissions rating. If you include HAP, then this should positively impact your rating as well as negatively, as appropriate.
b. Furthermore, the analysis does not cover all stove models (understandably so) but most of the devices used in projects are tested for emissions, thermal efficiencies etc. by an independent 3rd party already. The PD can be asked to share these test reports with BZC when the rating process begins which can be taken into consideration when assessing this parameter.
c. You say data comes from “International Workshop Agreement” with no reference; where is this data from?
d. You go on to say “We find that some ICSs have a low performance when reducing HAP, which may suggest an overestimation of actual reductions.” How do you correlate performance on HAP with emissions reductions? Of course the higher the efficiency of the stove, the higher the HAP reduction, but where does this fit into the rating of ICS projects? Unless a project is seeking to credit ‘black carbon’ as an emissions reduction, which most are not, then this is not relevant.
e. Where is your evidence that suggests: “HAP may increase even when using ICS”?
We should point out that, to date, black carbon emissions have not influenced our ratings.
However, if black carbon emission reductions are claimed by a project, we find it useful to compare the monitoring processes associated with both HAP and GHG to gauge potential implications to over-crediting. We find that some projects that do measure HAP use similar methods to GHG monitoring, as noted in our black carbon insight. Nonetheless, as stated in the feedback, most projects do not claim for ‘black carbon’ as an emission reduction.
Further, the data used to display varying ICS parameters is from the International Workshop Agreement (IWA), as stated in the Clean Cooking Alliance cookstove catalogue. Here, several different ICS have been tested in accordance with IWA protocols to establish the ‘tier’ of each parameter. In addition, we consider the transparency of all documentation to be key to the integrity of carbon credits, and, as such, we are of the view that this information should be in the public domain in any case.
(7) In the “sample size” section on page 55, BZC mentions “an appropriate sample size and a representative population” is suggested for more efficient monitoring and projects using the minimum sample size are criticized. However, if statistical precision and confidence levels are achieved with the respective sample outlined in a methodology, then the sample is representative of the entire population regardless of the sample size. This is applicable in all scientific studies, not just for carbon projects, so what is your objection here?
Key to accurately estimating usage across project participants is selecting an appropriate sample size and a representative population. To determine the number of households that a project should sample, methodologies often adopt the same general approach as outlined in the CDM’s Sampling and surveys for CDM project activities and programme of activities, with the exception of VMR0006 which also allows a more simplified approach.
The most basic version of the CDM’s approach uses a simple random sampling method and assumes a homogeneous population. When collecting annual samples, sample size is determined using a 90% confidence interval and 10% margin of error. If biennial surveys are conducted, a 95% confidence interval and 10% margin of error is required.
We find no justification in methodologies for the use of a 90% confidence interval. Social sciences and environmental studies tend to adopt a minimum 95% confidence interval, as this necessarily increases precision and reduces error propagation. Adopting a 90% confidence level is generally only seen as appropriate for sampling small populations with minimal variance, as it otherwise allows for a greater scope of inaccuracy. In reality, we find that the average number of cookstoves distributed by projects that BeZero has rated is about 120,000, suggesting that they are operating across large populations and thus subject to greater levels of sampling uncertainty.
In addition to sample size, sample population characteristics are also essential to accurate monitoring. In order for surveys to generate reliable information, it is important that the sample population is representative of the entire target population, in this case end-users of ICS.
We find that many ICS projects operate at a national scale, with stoves distributed to multiple demographics across the country. For BeZero-rated projects, we find around 51% of projects have a national boundary. Even in those cases where projects operate at a more local level, we find that there can still be distinct populations of end users with different traits that may affect cooking practices. It is therefore common for stoves to be distributed to heterogeneous populations, and in such instances it is best practice to adopt a stratified approach and sample multiple population groups that represent different characteristics. Despite this, we find that the majority of projects that BeZero has rated have not sufficiently stratified their samples and may not capture differences in regional and household conditions.
(8) Your argument on stove usage is extremely weak and demonstrates a fundamental lack of understanding of project-specific conditions. Where independent auditors have conducted site visits to assess usage rates there is considerably more assurance around usage data than using non-project-specific literature. You fail to acknowledge that usage can and does increase over time as developers employ outreach programmes to encourage usage – indeed developers are incentivized to do this.
We acknowledge that there are several bottom-up factors that influence usage rates, as evidenced by the analysis in our ratings on the BeZero Carbon Markets platform and the Guatemala case study example in the methodology (Box 28, page 62). The methodology highlights a simple snapshot of analysis, ranging from methodology-specific parameters to sector-level assessments, in addition to local, regional, and national conditions, which we find to be consistent with peer-reviewed literature and our analysis of the Household Devices sector group. There are several components underlying usage which are assessed for each project we rate, including, but not limited to, follow-up support, availability of maintenance, training and education initiatives undertaken by the project, sample size, sample stratification, frequency and seasonality, survey methods, and recall and social biases.
To be clear, a high or increasing usage rate is not an inherently bad thing; on the contrary, high usage rates are an indication of a successful project where the ICS has adequately met the needs of the end user. In fact, we find examples of such projects within our rated portfolio of Household Devices projects where usage rates may be suitable. For example, energy efficient heating devices in Mongolia have usage rates of between 50% and 90%. We find that these samples have been stratified according to location and settlement type, which is likely to be more representative of household conditions, whilst the project has also implemented several marketing and follow-up strategies. However, usage rates are highly context dependent. For example, evidence suggests that in donor-led programmes (where an ICS is often distributed for free), usage rates tend to be lower because end-users do not value the stove as much and tend to have fewer resources for support and maintenance. In contrast, projects that may utilise carbon finance to provide education, training, and follow-up support may have much higher usage rates. In addition, it is imperative to assess exactly how a project’s usage rate has been determined and what population has been sampled, as there are sources of uncertainty in different approaches. These bottom-up factors are an essential part of our analysis of the over-crediting risk associated with usage rates.
(9) Your section on Stove Stacking fails to identify how paired KPTs negate this risk by isolating the fuel use in question (i.e. biomass) and comparing the pre- and post-project scenarios. If other stoves exist and if they are being used it will be demonstrable in the baseline and project KPT data. Equally where baseline stoves continue to be used in the project scenario, this is always captured in KPT/habit survey data. Referring to published literature is considerably less accurate than reviewing independently verified, project-level data. Therefore, KPT results and other monitoring results such as the usage rate and stove stacking should all be considered when analysing the project as a whole.
BeZero acknowledges that there are several ways in which projects may monitor for stove stacking, such as KPTs or surveys. BeZero’s assessment of stove stacking rates takes into account the robustness of sampling (e.g., paired KPT versus survey, sample size, stratification, frequency, seasonality, etc.). We also leverage peer-reviewed literature to inform where the risk of stove stacking may arise or be more prevalent, whether due to stove design, geography, cultural preferences, etc. As discussed above, methods to determine stove stacking (as well as usage) are all key inputs to our bottom-up analysis of these rates. In fact, our ratings reports cover this analysis in detail and are provided to all project developers as part of our developer engagement process, where project-specific analysis can be interrogated.
To illustrate this, we use an example from an energy efficient project in Ghana. Our analysis of peer-reviewed literature indicates that around 50% of households in Ghana use more than one device for cooking. However, in the first three monitoring periods of this particular project, 97% of end-users kept the baseline stove, and the project made no deductions for this. In the fourth monitoring period, the project did apply a 13% deduction, although this is still considerably lower than the number of households that kept the baseline stove. As such, uncertainties remain, creating scope for greater risk within our over-crediting assessment. We use peer-reviewed literature to aid our understanding of cooking conditions, which enables us to form a balanced opinion.
(11) The acronym “NBS” is only explained as being “nature-based solutions” on page 71 but is used as an acronym earlier on in the methodology and is not on the acronym list.
BeZero thanks you for bringing this to our attention and will add it to the acronym list for clarity.
(12) You discuss leakage from ‘other cookstove uses’:
a. This fails to recognize that KPTs assess any other usage as they measure fuel use at the household level, not just on the ICS. Applying KPTs negates any leakage from other cookstove usage.
b. This also applies to the “Jevons paradox” which is equally covered by KPT assessments at the household level.
c. Additionally, projects can evaluate potential leakage sources during the baseline survey (e.g., if the baseline stoves are used for heating) or monitoring survey/s. The results can be used to justify the default leakage discount as a conservative approach.
Our approach to leakage considers regional, sector, and method-specific contexts and compares project activities against these. Firstly, through our cross-sectoral methodology that enables fungible comparisons, BeZero acknowledges that projects in the Household Devices sector group tend to have limited leakage risk in comparison to other sectors, such as those in the Nature-Based Solutions sector group. Secondly, our bottom-up approach does assess how a project may monitor for leakage (including the use of a KPT) and how robust such approaches are. The risks identified in the methodology are all the possible leakage risks associated with Household Devices projects that we may assess, but not all will apply to every project type, depending on its project-specific context and leakage monitoring.
(13) With regards to “Reversal Risk” on page 77, household device methodologies do not account for carbon stocks in forests. Instead, they focus on reducing emissions from biomass combustion. The emissions reduction credits (ERs) generated during fuel combustion are not reversible. The potential impact of “Members of the population who do not participate in the project, and previously used lower emitting energy sources, instead use the nonrenewable biomass or fossil fuels saved under the project activity” is accounted for within the project leakage assessment.
We thank you for your comment but do not agree with this statement, as the fNRB corresponds to the proportion of biomass that remains standing as a result of project activities. Therefore, if the carbon from this saved biomass were to be released as a result of some other occurrence, such as a fire, savings from the project would essentially be reversed.
While we acknowledge that accounting for reversal risk for Household Devices projects is a highly nuanced process, there is still some risk due to the carbon stocks saved by the project activity. Emission reductions account for saved carbon stocks from non-renewable sources, and while those exact sources may not be identified, the risk of reversal is not zero. As such, whilst we acknowledge the potential for reversal risks in this sector group, without data regarding the location of household-level fuelwood collection, both pre- and post-project, it is not possible to monitor the permanence of ‘saved’ carbon stocks.
Nevertheless, due to the cross-sector fungibility between our ratings, we find that in the Household Devices sector, reversal risk is likely to be low in comparison to other sectors such as Nature-Based Solutions projects.
(14) The section about end-user household locations on page 77 states that it's rare for projects to disclose the exact locations of households. This information is, however, available for many of the newer projects which collect the GPS location of each monitored end-user and also include GPS locations in their end-user databases. It is to be noted that the GPS locations and the full name of the end-users might not be disclosed within the publicly available documents for privacy issues but are always shared with the independent auditors.
We welcome projects that use GPS to help determine the location of each project stove. Along with other methods used by newer projects, such as metered devices or stove use monitors, this enables more reliable carbon calculations through the increased accuracy of accounting parameters.
BeZero acknowledges that some projects do disclose the locations of end-users, even though it is not required to make this publicly available by the standards bodies. We find that around 36% of BeZero-rated projects in the sector use GPS for the location of project households, according to project documents. However, only around 21% of these projects provide information on household location at a district level. To provide transparency, projects could release publicly available sales databases, as one project in Uganda (GS10967) has done. Here, the district where each end-user resides is displayed, which, paired with our own analysis of settlement types in Ugandan districts, can be used for our assessment of additionality. This information contributes to our view that only limited risk to additionality exists, amongst other factors. Overall, we welcome strides to overcome these disclosure challenges to the Household Devices sector, but nonetheless, information that is not publicly available is not considered in the rating.
(15) With regards to the “Sustainable Development Goals” section on page 83, there has been standardization work done for the monitoring, reporting, and verification of the SDGs (other than SDG 13). For example, the GS SDG tool, which has been available since 2021.
We recognise that some standards bodies have developed standardised processes for projects to make SDG claims. Such systems include Gold Standard’s Gold Standard for the Global Goals (GS4GG) and Verra’s Sustainable Development Impact Standard Program (SD VISta).
However, there is a lack of standardisation of SDG claim MRV requirements across all standards bodies’ SDG claim systems. GS4GG and SD VISta, for example, require the monitoring of SDG claims, but projects registered with CAR, ACR, or Plan Vivo are not required to monitor theirs. This lack of standardisation across SDG claim requirements in the VCM leads to a wide range of SDG impact data being available.
Projects vary from reporting no information beyond the SDG claimed to providing baseline, ex ante, ex post, and monitoring SDG impact data. Yet SDG claims are displayed as SDG symbols on project registry pages, regardless of the quantity and robustness of the SDG impact data provided. This lack of transparency makes it difficult for buyers to compare the robustness of SDG claims across projects.
(16) With regards to the “Risk Factor Weighting” on page 85, some more clarity on the rating and how it relates to the final “score” would be useful. How does the rating scale relate to the risk factor weighting scale and how exactly are each of the sectors scored?
The BeZero Carbon Rating is a fungible rating across all sectors, and therefore aligned on the same six risk factors and eight-point ratings scale.
We take a preliminary view of carbon efficacy risk based on three core components, ordered by their relative importance in determining credit quality: Additionality, Carbon Accounting, and Non-Permanence. While all core components are important drivers of carbon efficacy, their relative role is subject to the materiality of individual risks. Note that carbon accounting includes an assessment of both over-crediting and leakage.
It should be noted that assigning the rating is a deeply analytical process, wherein the sole objective is to assign ratings reflective of the carbon credit’s efficacy and quality. In exigent circumstances where a specific risk factor is considered to have an overbearing impact on the overall rating, the rating can be constrained by said factor. This is applied asymmetrically, i.e. there is only a downside if a risk factor is deemed especially significant, it cannot have a positive mitigating effect on the overall rating.