Introduction

According to a report published by McKinsey Global Institute (2011), “big data” refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze. As big data has been adopted in every sector of the global economy, it creates enormous value for economic development and has become a main driver of sustainable development (Manyika et al. 2011; Nara et al. 2021). Although previous research has documented the positive impact of big data on economic, environmental, and social welfare (Katz and Koutroumpis, 2013; Khan et al. 2015), scholars have mainly focused on how digitalization contributes to economic growth (Myovella et al. 2020; Habibi and Zabardast, 2020). The digital economy, as a core sector of big data, is considered a key driver of sustainable development from both an economic and an environmental point of view (Goldfarb et al. 2015; Li et al. 2020). Through the dissemination of knowledge and the deep integration of technology, digitalization not only promotes industrial upgrading and transformation but also accelerates the shift of manufacturing industries toward service-oriented models, thereby laying a solid foundation for a low-carbon economy (Jetter et al. 2009; Paschou et al. 2020).

Among all the sources of greenhouse gas (GHG) emissions, human activities are identified as the primary drivers of global warming (Lashof and Ahuja, 1990). Although historical emissions of carbon dioxide (CO2) have brought economic wealth to the emitting countries, they have led to a dramatic increase in CO2 levels at a rate unprecedented in at least the last five hundred thousand years (Rickels et al. 2023; Friedlingstein and Solomon, 2005). At the same time, a new wave of industrial revolution, the Digital Revolution, is proceeding at an astonishing pace (Hodson, 2018). As the core productive element of the Digital Revolution, big data has driven the emergence of the digital economy (Goldfarb et al. 2015; Manyika et al. 2011). Globally, using big data to promote economic development, improve social governance, and enhance governments’ service and regulatory capacities is becoming a trend. Major economies have issued strategy documents to vigorously promote the development and application of big data, such as China’s “Action Outline for Promoting the Development of Big Data” (Guofa [2015] N50; Footnote 1; henceforth Action Outline), the European Union’s “Industry 5.0” (Footnote 2), and the United States’ “Federal Big Data Research and Development Strategic Plan” (Footnote 3). These policies have already promoted the digital and green transformation of economies. Consequently, digitalization is regarded as a vital pathway for countries to achieve sustainable development (Mehmood et al. 2023).

Considering this, digital economic policies are actively promoted by governments worldwide, which accelerates the transformation of integrated and efficient production networks. The implementation of these policies could densify the production network through the channels of encouraging technology innovation and reducing economic distortions. This kind of transformation contributes to enhancing overall production efficiency and fostering sustained economic growth (Acemoglu and Azar, 2020). Moreover, combining targeted environmental protection measures with digital economic policies can ensure both economic development and environmental sustainability (Acemoglu et al. 2012; Karlilar et al. 2023).

This study is conducted in the context of China’s “National Big Data Comprehensive Pilot Zones” (henceforth NBDCPZ) policy. On the one hand, China is the world’s largest emitter of carbon dioxide and faces significant pressure to reduce emissions. The country has committed to achieving carbon peaking by 2030 and carbon neutrality by 2060 (Wang et al. 2022; Zhao et al. 2022), which underlines the urgency and importance of addressing carbon reduction challenges. On the other hand, the Global Digital Economy White Paper (2020) (Footnote 4) reports that the digital economy value-added of 47 major countries reached $31.8 trillion in 2019. Of this total, China accounted for $5.2 trillion, second only to the United States ($13.1 trillion), while maintaining the fastest growth rate globally at 15.6%, which makes China one of the leading economies in digital transformation. Thus, motivated by these conditions in China, this paper documents the relationship between big data and digital economy policies and carbon emissions. The NBDCPZ policy offers a unique framework for this study, with clearly defined geographic and temporal boundaries that allow for a quasi-natural experimental design. This approach enables the identification of causal mechanisms between digitalization and carbon emissions, providing insights that are both regionally specific and globally relevant.

Although improvements in digital technology raise energy use efficiency, they can also increase energy consumption (Huang et al. 2024; Wang et al. 2021; Kouton, 2019). On the one hand, industries built on digitalization are considered to have low energy intensity and incremental energy consumption (Romm, 2002). Meanwhile, digital empowerment can promote the sustainable development of sectors such as agriculture (Basso and Antle, 2020), industry (Prajogo and Olhager, 2012; Hao et al. 2023), and finance (Hong and Xiao, 2024; Marszk and Lechman, 2021). It promotes economic growth and effectively reduces carbon emissions through the direct effects of digitalization as well as technological, efficiency, and structural effects (Wang et al. 2023; Wei et al. 2023; Wang et al. 2022). On the other hand, some of the existing literature indicates that digitalization may have a rebound effect. Lange et al. (2020) are concerned that digital development fosters new energy-intensive industries; carbon emissions may thus be generated during the production, installation, distribution, and upgrading of digital infrastructure, which results in adverse environmental consequences (Kunkel and Tyfield, 2021). Digitalization has improved energy efficiency, but the resulting reduction in costs has increased demand, causing energy consumption to rise rather than fall (Kouton, 2019). Moreover, digitalization requires digital infrastructure such as data centers and cloud computing as its foundation, and the construction and operation of this infrastructure consume a large amount of electricity. Given the current energy supply structure, most electricity is still derived from fossil fuels, particularly coal (Sikarwar et al. 2021), leading to a significant increase in carbon emissions (Danish et al. 2018; Avom et al. 2020; Khan et al. 2022).

As the impact of big data on carbon reduction becomes more important, the number of studies on big data and the low-carbon economy is increasing. Wei et al. (2023) and Hu (2023) examine the environmental effect of big data policy, especially how the implementation of China’s “National Big Data Comprehensive Pilot Zone” (NBDCPZ) policy affects carbon emissions. They find that the NBDCPZ policy achieves carbon reduction through technological innovation, industrial transformation, improved energy efficiency, and green total factor productivity. They also emphasize that the carbon reduction effects of the NBDCPZ policy differ significantly across types of cities. Our findings are consistent with theirs, and we additionally contribute evidence on the digitalization rebound effect on carbon reduction, which is less researched in the existing literature. Kunkel and Tyfield (2021) argue that considering “sustainable” industrialization in the context of ongoing digitization indicates that, without coordinated interventions in research and policymaking regarding digitization and sustainability, digital rebound effects could be a default outcome. Our paper focuses on examining and answering this question: Is the digital rebound effect as severe as emphasized in the literature?

Our study contributes to the empirical NBDCPZ studies about carbon reduction pathways and optimization from the perspective of potential mechanism conflicts. Specifically, we document a quasi-natural experiment to analyze whether and to what extent this policy affects carbon emission reduction by using a difference-in-difference (DID) approach. Considering the dual effects of digitalization on carbon emission caused by digital infrastructure expansion and technology innovation, we also employ a causal path analysis (CPA) to study if the rebound effect of digital infrastructure will weaken the positive effects on carbon reduction.

We shed light on the digitalization rebound effect on carbon emission reduction, which was previously less explored. The empirical findings confirm that the NBDCPZ policy achieves carbon reduction through the channel of digital technology innovation, while the mechanism of digital infrastructure expansion leads to an increase in carbon emissions. However, the results of the CPA indicate that when we test the combined effect of these two channels, the digitalization rebound effect disappears, and together they lead to a decline in carbon emissions. Additionally, we utilize the Policy Learning model to optimize policy assignments and improve carbon reduction efficiency. We find that implementing the Big Data Policy in less-developed regions can reduce carbon emissions more efficiently. Using a sample of 282 cities in China from 2006 to 2019, we conduct empirical analyses and estimate the average treatment effect with DID, with all results passing robustness checks.

Overall, this research makes a three-fold contribution to the literature. First, we make a first attempt to investigate how serious the digitalization rebound effect is by using the CPA approach. We show that although the construction and development of digital infrastructure discourages low-carbon growth, the overall effect of the NBDCPZ policy on carbon reduction is positive once the digital technology innovation effect is combined with it. Second, we extend the research of Wei et al. (2023) and Hu (2023) by optimizing the policy after evaluation. We are the first to apply a policy learning model to the optimization of the carbon reduction effects of big data policies. Our findings reveal differences in carbon emissions at the individual (city) level, showing that the direction and magnitude of the policy impact on carbon emissions vary across cities. This helps us to optimize the policy allocation rules, further enhancing the carbon reduction efficiency of the policies, and provides more scientific decision-making recommendations for the next batch of Big Data Policy implementations. Third, our research holds significant policy implications for developing countries aiming for low-carbon economic transitions through big data policies. It offers new insights into addressing frontier issues related to global warming in the digital age.

The rest of this paper is organized as follows. Section “Policy background and hypothesis development” reviews the related literature, introduces the policy background, and develops testable hypotheses based on the analytical framework. Section “Methods and data” presents the data sources, descriptive statistics, and empirical model design. Section “Carbon reduction effect test and robustness check” analyzes the empirical results and robustness tests. Section “Conclusion and policy implications” concludes the paper.

Policy background and hypothesis development

Research on carbon emission reduction

Since 2005, China has been the world’s largest carbon emitter, having surpassed the United States. Its 35 major cities contribute approximately 40% of the country’s energy consumption and carbon emissions while accounting for only 18% of the national population (Dhakal, 2009). In response to the pressing challenge of climate change, China has pledged to achieve carbon peaking by 2030 and carbon neutrality by 2060, demonstrating its commitment to global sustainability efforts. This commitment has led to increasing academic attention on carbon reduction pathways, with researchers emphasizing the roles of the public, enterprises, and government in providing crucial support for theoretical innovation and practical applications in low-carbon development.

In the public sphere, awareness and participation are considered essential drivers of carbon reduction. Chen et al. (2023) show that strong public attention to climate change is expressed through social media and environmental discourse, which leads policymakers and industries to adopt stricter environmental management measures. For example, discussions on Weibo (a Chinese social media platform) reveal that public sentiment toward carbon neutrality not only reflects optimism about achieving sustainable development but also underscores a strong belief in collective efforts to meet these goals (Li et al. 2023). At the enterprise level, technological innovation and operational efficiency are the essential ways to achieve carbon reduction targets. Firms’ digital transformation not only reduces carbon emissions but also promotes the widespread adoption of sustainable practices by enhancing energy efficiency and fostering green innovation (Shang et al. 2023; Zhang et al. 2024). Moreover, ESG ratings serve as an important regulatory framework, aligning corporate actions with sustainability objectives. For instance, ESG initiatives improve corporate performance in carbon reduction by easing financing constraints and increasing transparency (Li and Xu, 2024).

Meanwhile, green investments led by the government are considered a cornerstone for driving low-carbon transitions. Research shows that investments in renewable energy and green technology significantly reduce carbon emissions by creating a sustainable financial ecosystem (Huang et al. 2021; Li et al. 2021). Additionally, governments can facilitate the transformation to a low-carbon economy by implementing carbon reduction policies and offering financial incentives, laying a solid foundation for achieving long-term environmental goals (Xuan et al. 2020; Zhou et al. 2024).

Policy background

The integration of information technologies with economy and society has led to a rapid growth of data. Data has become a basic strategic resource, and big data is increasingly having a significant impact on the economic operation mechanism, social lifestyle, and the state’s capacity for governance.

In 2015, the State Council issued the Action Outline for Promoting the Development of Big Data (Guofa [2015] N50, or No. 50 [2015] of the State Council; Footnote 5), hereinafter referred to as the “Action Outline”, which established the strategic framework and guiding principles for the development of big data in China. Its main tasks are accelerating the opening and sharing of government data, promoting the integration of resources, and enhancing governance capacity. In the field of environmental management, the openness of environmental data facilitates increased public participation. Moreover, establishing a state-level macro-control data system can provide scientific, forward-looking, and effective decision support for environmental governance. The Chinese government’s 13th Five-Year Plan (2016–2020) (13FYP; Footnote 6) further clarifies the strategic direction of big data and advocates smart energy and green ecological development based on “Internet +”. These policies demonstrate China’s efforts to utilize big data to promote environmental monitoring, green production, and ecological construction, thereby driving the transformation of the economic structure.

The NBDCPZ policy aims to promote the innovative development of big data in China. In 2016, the National Development and Reform Commission (NDRC), the Ministry of Industry and Information Technology (MIIT), and the Cyberspace Affairs Commission (CAC) agreed to the establishment of the National Big Data (Guizhou) Comprehensive Pilot Zone, the first NBDCPZ in China (Footnote 7), which aims to promote the integration of regional big data infrastructure and the aggregation and application of data resources, thus leveraging its demonstrative role to drive development. Furthermore, in October of the same year, the central government approved the second batch of pilot zones, which includes the Beijing–Tianjin–Hebei Region, the Pearl River Delta, and other regions (Footnote 8). The positioning and functions of these pilot zones are as follows: Guizhou serves as the leading pilot zone, focusing on the sharing and integration of data resources; Beijing–Tianjin–Hebei and the Pearl River Delta are cross-regional pilot zones, emphasizing the integration of data flows with technology, materials, capital, and talent flows, as well as the synergy of public services, social governance, and industrial transfer; Shanghai, Henan, Chongqing, and Shenyang are regional demonstration pilot zones, leading the balanced development of the four major regions nationwide, strengthening the clustering of the big data industry, and optimizing regional cooperation; Nei Mongol focuses on the coordinated development of infrastructure, with an emphasis on resource integration, green development, and stronger cooperation with major industrial and talent agglomeration areas.

In addition, local governments have issued action plans, such as the Implementation Plan for the Construction of the National Big Data Comprehensive Pilot Zone in the Pearl River Delta (Yuebanhan [2017] N184; Footnote 9) and the Implementation Plan for the Construction of the National Big Data Comprehensive Pilot Zone in Henan Province (Yuzheng [2017] N11; Footnote 10). These plans advocate the application of big data in environmental governance, energy efficiency, and green-intensive development, aim to improve the accuracy of environmental law enforcement and the efficiency of resource management, and further promote the sustainable growth of a low-carbon economy. The Plan for Development of the Digital Economy During the “14th Five-Year” Period (No. 29 [2021] of the State Council; Footnote 11) and the 14th Five-Year Plan for Renewable Energy Development (Fagai [2022] N210; Footnote 12) further emphasize the transformation of the digital industry and low-carbon development, aiming to promote the innovation of green technology and the intelligent upgrade of the energy industry chain. Overall, the “National Big Data Comprehensive Pilot Zone” policy is guiding China’s urban low-carbon transformation and accelerating the pace of sustainable development.

Research hypothesis

Digitization has a positive effect on carbon reduction. The core objective of establishing the National Big Data Comprehensive Pilot Zone is to explore the application mechanisms of big data. Firstly, local governments utilize big data to conduct detailed analyses of carbon footprints, enabling more precise identification of emission sources. By integrating data from various sectors such as transportation, energy production, and industry, big data technology can pinpoint the areas where emission reduction is most effective. Smart grid technology optimizes energy distribution through big data analysis, which can reduce unnecessary energy consumption and carbon emissions (Verbong et al. 2013). The predictive analytics function of big data aids in forecasting emission trends and assessing the potential impacts of various emission reduction strategies, enabling policymakers and businesses to take proactive measures and optimize resource allocation in emission reduction efforts (Huo et al. 2022). The deployment of big data tools also enhances the transparency of carbon emission reporting. Accurate and real-time emission data ensure that companies and governments are accountable for their carbon footprints and encourage more stringent emission reduction commitments (Zhang and Huo, 2023; Su et al. 2020). Secondly, the “National Big Data Comprehensive Pilot Zone” policy is a core driver of digital economy growth (Qiu and Zhou, 2021) and promotes environmental transformation through technological innovation and industrial diversification (Yi et al. 2022; Tan et al. 2023). The digital industry drives the transition to renewable energy, reduces energy consumption, promotes innovation in low-carbon technologies, and thus decreases carbon emissions (Razzaq et al. 2023).
Finally, the integration of digital and traditional industries improves product structure and operational efficiency, facilitating the green development of traditional industries (Carrière-Swallow and Haksar, 2019; Liu et al. 2023). Thus, Hypothesis 1 is proposed.

Hypothesis 1. National Big Data Comprehensive Pilot Zone can lower carbon emissions and will positively impact low-carbon transformation in cities.

There are two main effects of digitization; the first one is the digital rebound effect, which negatively impacts carbon reduction. Although the Action Outline (2015) emphasizes the importance of establishing digital infrastructure as a key element in promoting big data system innovation, data sharing, and digital industrial cluster development, digital infrastructure development still adversely affects carbon emissions. Digitalization significantly enhances productivity and production output, but digital infrastructure, as high-energy-consuming equipment (Tang and Yang, 2023), relies heavily on coal, oil, and natural gas during its production, application, and disposal stages, leading to increased energy demand (Lan and Zhu, 2023), exacerbating environmental pollution, and potentially diminishing the positive benefits of digitalization in enhancing efficiency (Madlener et al. 2022; Liu et al. 2023). Furthermore, the construction and operation of data centers, as a central component of digital infrastructure, have been identified as significant energy consumers. For instance, computing infrastructure, including data centers and supercomputing centers, not only consumes vast amounts of electricity but also leads to a notable increase in carbon emissions, especially in regions where energy predominantly comes from fossil fuels (Mao et al. 2024). Similarly, the rapid proliferation of 5G base stations and other digital infrastructure elements has amplified energy demand, placing substantial pressure on local power grids and elevating carbon emissions, particularly in high-density urban areas (Che et al. 2024). Despite the potential for improved energy efficiency through advancements in digital technology, the initial phases of infrastructure expansion often result in an “energy trap” characterized by increased fossil fuel reliance and carbon output (Chen et al. 2024). 
Therefore, the carbon emissions generated during the construction, operation, and maintenance of digital infrastructure contribute to a long-term carbon emission lock-in effect, with this “inertia-driven carbon lock-in” phenomenon potentially becoming a primary challenge for Chinese cities in reducing carbon emissions (Müller et al. 2013; Zheng et al. 2018). Thus, Hypothesis 2 is proposed.

Hypothesis 2. (Infrastructure) National Big Data Comprehensive Pilot Zone may lead to the expansion of digital infrastructure, resulting in increased energy consumption and, consequently, higher carbon emissions.

The second one is the technological effect, which will positively affect carbon reduction. The impact of digital technology innovation is considered one of the methods for controlling carbon emissions (Zhu et al. 2022). Digital technology innovation refers to the process and outcomes of enterprises or organizations developing new products, processes, organizational structures, and business models based on digital technologies (Yoo et al. 2012; Nambisan et al. 2017). Specifically, information, computing, communication, and connectivity technologies, along with their integration—such as artificial intelligence, big data analytics, cloud computing, and blockchain technology—are all manifestations of digital technology (Vial, 2021). The diffusion of digital technology, green development strategies, and innovation-driven growth policies all exert powerful inhibitory effects on environmental pollution (Xu et al. 2022; Li et al. 2021; Wen et al. 2021). Meanwhile, recent studies have highlighted the multifaceted benefits of digital technology in reducing carbon emissions. For example, green technology innovation driven by digital technologies has been shown to optimize industrial structures and promote energy transition, significantly curbing carbon emissions (Huang et al. 2024). Additionally, the integration of digital technologies into supply chains enhances their carbon performance by enabling real-time monitoring and fostering collaborative efforts in emission reduction (Li et al. 2025). Furthermore, digital technologies, such as industrial robots and smart energy systems, enhance energy efficiency and enable precise energy management, reducing carbon emission intensity across urban and industrial sectors (Liu et al. 2024). Given the current climate, the circular economy has garnered high attention from global governments and enterprises, providing an excellent opportunity for digital technology to positively impact sustainable development (Jones and Wynn, 2021). 
Mainly through increasing the proportion of non-fossil energy utilization and optimizing industrial layout, digital technology innovation is expected to further reduce carbon emission intensity (Wang et al. 2021). For example, digital technology can be widely applied in traditional production activities through penetration effects. It can tightly link various production processes, and enhance their synergistic effects, thereby improving resource and energy efficiency and achieving efficient, green production activities (Park et al. 2018). In terms of government management, digital technology can make a positive contribution to climate change research by detecting new patterns in environmental data (Vinuesa et al. 2020). Thus, Hypothesis 3 is proposed.

Hypothesis 3. (Technology) National Big Data Comprehensive Pilot Zone may drive advancements in digital technology innovation, enhancing efficiency and contributing to carbon emission reductions.

Methods and data

Model setup

In policy evaluation, the difference-in-difference (DID) method is widely used to establish causal inference by comparing the changes in outcome variables between treatment and control groups before and after the policy is implemented (Qian, 2008; Chen et al. 2021). However, policymakers typically consider factors such as regional industrial structure and economic development level when selecting the pilot cities of the National Big Data Comprehensive Pilot Zone, which results in systematic differences between pilot and non-pilot cities. Pilot assignment therefore fails to meet the requirement of randomness, which can interfere with the identification of causal effects.

To identify the effect of the National Big Data Comprehensive Pilot Zone policy on carbon emissions, this paper adopts the DID method to examine the treatment effects of the policy and subsequently analyzes the impact of potential non-random factors in identification. Individual factors, time factors, and other variables are controlled in this model.

The specific model settings are shown in Eq. (1):

$$\mathrm{Carbon}_{it}=\alpha+\beta\,\mathrm{BigData}_{it}+\gamma\,\mathrm{Control}_{it}+\mu_{i}+\lambda_{t}+\varepsilon_{it}\tag{1}$$

where $\mathrm{Carbon}_{it}$ denotes the total carbon emissions of city $i$ in year $t$; $\mathrm{BigData}_{it}$ is a dummy variable that equals 1 if city $i$ is a pilot city in year $t$ and 0 otherwise; the coefficient $\beta$ captures the treatment effect of the pilot policy; and $\mathrm{Control}_{it}$ refers to a set of city-level control variables. We also control for city and year fixed effects: $\mu_{i}$ denotes the city fixed effect, $\lambda_{t}$ the year fixed effect, and $\varepsilon_{it}$ the random error term.
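To make the structure of Eq. (1) concrete, the following is a minimal Python sketch, not the estimation code used in this paper. It assumes a balanced panel, omits the control variables, uses a single common treatment year, and runs on synthetic data; the function names (`two_way_demean`, `twfe_did`, `simulate_panel`) and all parameter values are hypothetical. With one regressor, removing city and year means from both the outcome and the policy dummy and then regressing one on the other gives the two-way fixed effects estimate of $\beta$.

```python
import random

def two_way_demean(x):
    """Remove city means and year means from a balanced panel x[city][year]."""
    n, T = len(x), len(x[0])
    row = [sum(r) / T for r in x]                                   # city means
    col = [sum(x[i][t] for i in range(n)) / n for t in range(T)]    # year means
    grand = sum(row) / n
    return [[x[i][t] - row[i] - col[t] + grand for t in range(T)]
            for i in range(n)]

def twfe_did(y, d):
    """Coefficient on d in y_it = beta*d_it + mu_i + lambda_t + e_it (no controls)."""
    yt, dt = two_way_demean(y), two_way_demean(d)
    num = sum(a * b for ry, rd in zip(yt, dt) for a, b in zip(ry, rd))
    den = sum(b * b for rd in dt for b in rd)
    return num / den

def simulate_panel(n_cities=60, n_years=14, n_treated=20, start=10,
                   beta=-0.2, seed=0):
    """Synthetic log-emissions panel; all values are illustrative assumptions."""
    rng = random.Random(seed)
    mu = [rng.gauss(0, 1) for _ in range(n_cities)]   # city fixed effects
    lam = [0.05 * t for t in range(n_years)]          # year fixed effects
    d = [[1.0 if i < n_treated and t >= start else 0.0
          for t in range(n_years)] for i in range(n_cities)]
    y = [[beta * d[i][t] + mu[i] + lam[t] + rng.gauss(0, 0.05)
          for t in range(n_years)] for i in range(n_cities)]
    return y, d

y, d = simulate_panel()
beta_hat = twfe_did(y, d)
print(round(beta_hat, 3))
```

On the synthetic panel the estimate should land close to the assumed effect of -0.2. An applied analysis would add the control variables, allow staggered treatment dates, and use clustered standard errors, none of which this sketch covers.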

Variables

Explained variables

Carbon emissions (the total amount of carbon emissions) are the core element of China’s “double carbon” strategy. They include emissions from the combustion of fossil fuels, greenhouse gas emissions resulting from industrial production processes, land use, and forestry activities, as well as indirect emissions from purchased electricity and heat. This paper calculates carbon emissions based on the method of Wu and Guo (2016). See Appendix A.1 for a more detailed description. The carbon emission data have been logarithmically transformed.

Core explanatory variable

The core explanatory variable is the policy dummy variable of the National Big Data Comprehensive Pilot Zone. Since there are significant differences in scale and economic levels among cities within the same province, we follow the strategy of Wei et al. (2023) and choose Beijing, Tianjin, Zhangjiakou, Langfang, Chengde, Qinhuangdao, and Shijiazhuang as the treatment group.

Except that the treatment period of Anshun City, Guizhou Province, is set to 2015, the treatment groups are consistent with the setting of Wei et al. (2023). Based on this treatment group setting for the National Big Data Comprehensive Pilot Zones, the difference-in-difference variable is constructed. Figure 1 shows the details of the treatment and control groups.

Fig. 1 Schematic diagram of the NBDCPZ period.

Control variables

Following Xue and Chen (2022), this paper selects the following control variables to account for other factors that affect carbon reduction: (1) The level of economic development (Pgdp) is represented by the ratio of urban GDP to the urban population at the end of the year (Narayan et al. 2016). (2) The level of financial development (Finance) is represented by the ratio of the loan balances of financial institutions to urban GDP at the end of the year (Zhang, 2011). (3) Industrial structure (Ind) is represented by the ratio of the value-added of the tertiary industry to the value-added of the secondary industry (Deng et al. 2023). (4) Population density (Pd) is represented by the ratio of the urban population at the end of the year to the urban area (Wang and Li, 2021). (5) The level of opening-up (FDI) is represented by the ratio of the value of foreign direct investment to urban GDP (Paramati et al. 2017). (6) Government science and technology expenditure (Ten) is represented by the ratio of government science and technology expenditure to general public expenditure (Paramati et al. 2017). (7) The degree of government intervention (Gov) is represented by the ratio of urban public expenditure to urban GDP (Xiang et al. 2023). See Appendix A.2 for a more detailed description. Pgdp, Finance, Ind, and FDI have all been logarithmically transformed.
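The variable definitions above can be sketched in a few lines of Python. This is an illustration only: the input field names (`gdp`, `population`, `loan_balance`, and so on) are hypothetical, the units are left unspecified, and the exact construction used in the paper is the one given in Appendix A.2. Only the four variables stated to be log-transformed are logged.

```python
import math

def build_controls(city):
    """Build the seven controls for one city-year from raw indicators.

    `city` is a dict with hypothetical field names; every ratio follows the
    verbal definition in the text. The log requires strictly positive ratios,
    so zero FDI would need separate handling in real data.
    """
    gdp = city["gdp"]
    return {
        "Pgdp": math.log(gdp / city["population"]),
        "Finance": math.log(city["loan_balance"] / gdp),
        "Ind": math.log(city["tertiary_value_added"]
                        / city["secondary_value_added"]),
        "Pd": city["population"] / city["area"],
        "FDI": math.log(city["fdi"] / gdp),
        "Ten": city["sci_tech_expenditure"] / city["public_expenditure"],
        "Gov": city["public_expenditure"] / gdp,
    }

# Made-up numbers for a single city-year, purely for illustration.
example = {
    "gdp": 2000.0, "population": 100.0, "loan_balance": 3000.0,
    "tertiary_value_added": 900.0, "secondary_value_added": 800.0,
    "area": 50.0, "fdi": 40.0,
    "sci_tech_expenditure": 6.0, "public_expenditure": 300.0,
}
controls = build_controls(example)
print(controls)
```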

Data

This paper utilizes balanced panel data for 282 prefecture-level cities in China from 2006 to 2019; some missing values were filled using the Random Forest algorithm. In addition, the level of digital technology innovation comes from the Chinese Patent database, where digital patents are classified according to the definition of the China Urban Digital Economy Development Report 2021. Other data come from the China City Statistical Yearbook, China Urban and Rural Construction Statistical Yearbook, China Energy Statistical Yearbook, and China Regional Statistical Yearbook. Table 1 displays the descriptive statistics for most variables, including the sample size, mean, and standard deviation for the entire dataset and for the subsamples classified by “BigData”.

Table 1 Descriptive Statistics.

Carbon reduction effect test and robustness check

Baseline results

This section tests the impact of the National Big Data Comprehensive Pilot Zone Policy on carbon reduction. The results are reported in Table 2, where we present three models that differ in the control variables included. The first column shows the result with no additional control variables, only city and year fixed effects. The coefficient is −0.2433 and statistically significant at the 1% level, indicating a strong negative impact of NBDCPZ implementation on carbon emissions. To ensure the reliability of this effect, we control for other variables and adopt various models. Columns 2 and 3 show that the negative effect, that is, the reduction effect, remains valid. After adding controls such as Pgdp, Finance, and Ind, the coefficient is −0.1902 and significant at the 1% level. In Column 3, where all control variables are included, the coefficient for BigData is −0.1905 and still statistically significant at the 1% level, indicating a pronounced effect: NBDCPZ implementation reduces carbon emissions. Jiang (2023) notes that it is difficult to identify the causal effect in a subsample because of omitted variable bias, so we use stepwise regression as a sensitivity analysis. Following Bellows and Miguel (2009), we gauge the relative importance of omitted variable bias by investigating how the coefficient of BigData changes with the inclusion of additional explanatory variables. Compared with the coefficient in Column 1, the coefficient changes noticeably in Column 2 after some core controls are added; however, the coefficient of BigData in Column 3 is almost the same as in Column 2. The ratios of the “influence” of omitted variables relative to the observed control variables are 3.6498 and 3.6740, which means that although the baseline result may be affected by omitted variable bias, this influence is small enough to be ignored.
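For reference, the ratio used in this type of sensitivity analysis is commonly written as follows. The notation is ours and is an assumption about the form applied here: $\hat{\beta}^{R}$ denotes the BigData coefficient from a restricted specification (no or few controls) and $\hat{\beta}^{F}$ the coefficient from a specification with a fuller set of controls.

$$\mathrm{Ratio}=\frac{\hat{\beta}^{F}}{\hat{\beta}^{R}-\hat{\beta}^{F}}$$

A larger ratio means that selection on unobservables would have to be correspondingly stronger, relative to selection on the observed controls, to explain away the estimated effect.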

Table 2 Baseline.

Parallel trend test results

We conduct a parallel trend test of the progressive DID model, following Beck et al. (2010), to check whether carbon emissions in the pilot cities followed the same trend as those in non-pilot cities before the National Big Data Comprehensive Pilot Zone Policy was adopted. The event-study specification is given in Eq. (2):

$${\rm{Carbon}}_{it}=\alpha +\sum_{n=-9}^{3}\beta_{n}\,{\rm{BigData}}_{it}^{n}+\gamma\,{\rm{Control}}_{it}+\mu_{i}+\lambda_{t}+\varepsilon_{it}$$
(2)

where BigData_it^n is a set of dummy variables for the periods before and after the National Big Data Comprehensive Pilot Zone Policy was introduced in a region. It equals 1 if city i belongs to a pilot zone and year t is the n-th period relative to the introduction of the policy, and 0 otherwise. The coefficients β_n show whether carbon emissions in the treated regions followed the same trend as those in the control regions. All other variables are defined as in the baseline model.
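
The event-time dummies in Eq. (2) can be built as in the following minimal sketch. The function name and the example years are hypothetical. The text does not state which period serves as the omitted base or whether distant periods are binned, so the sketch leaves both choices open and only maps each city-year to its relative period n in the window from −9 to 3.

```python
def event_time_dummies(year, policy_year, n_min=-9, n_max=3):
    """Return {n: 0 or 1} for one city-year observation.

    policy_year is the year the city joined a pilot zone, or None for a
    city that is never treated (all dummies stay 0).
    """
    dummies = {n: 0 for n in range(n_min, n_max + 1)}
    if policy_year is not None:
        n = year - policy_year  # periods relative to policy introduction
        if n_min <= n <= n_max:
            dummies[n] = 1
    return dummies


# Hypothetical example: a city treated in 2016, observed in 2018 (n = 2).
treated_row = event_time_dummies(2018, 2016)
# A never-treated city has no active dummy in any year.
control_row = event_time_dummies(2018, None)
print([n for n, v in treated_row.items() if v == 1])
print(sum(control_row.values()))
```

In a regression, one of these dummies (often the period just before treatment) is typically dropped as the reference category; which one the authors drop is not specified here.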

Figure 2 reports the results of the parallel trend test. Before the establishment of the National Big Data Comprehensive Pilot Zone, there is no significant difference in carbon emission trends between the treated and control regions, so the parallel trend assumption is satisfied. After the policy is introduced, pilot cities show a significant decline in carbon emissions relative to non-pilot cities in all periods, and the magnitude of the coefficient grows year by year, which suggests that the pilot zone accelerates carbon reduction.

Fig. 2 Pre-treatment trend event-study.

Robustness checks

Placebo test

Following Chen and Yang (2019), Cao et al. (2021), and Sha (2023), this study uses a placebo test to confirm that the reduction in carbon emissions is an effect of the policy rather than of unobserved or omitted random factors. We first discard the actual treatment-group dummies and then randomly select cities as false treatment groups of the National Big Data Comprehensive Pilot Zone. Treated cities and policy implementation years are randomly drawn 5000 times to test whether the baseline regression results are robust. If the false treatment is unrelated to carbon reduction, the estimated coefficients of the 5000 false interactions should be insignificant, meaning that a false pilot zone assignment does not affect carbon emissions. The result of the placebo test is displayed in Fig. 3. The coefficients of the false interactions are distributed around 0 and are close to normally distributed. Therefore, the policy effect is robust and is not driven by random factors or omitted variables.
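
The logic of the placebo procedure can be sketched on synthetic data. Everything in the block below is an illustrative assumption: the panel is simulated, the true effect of −1.0 is invented, each draw uses a single false policy year, and a simple two-group, two-period difference in means stands in for the two-way fixed effects DID regression used in the paper. The number of draws is also reduced from 5000 to 500 to keep the example fast.

```python
import random
import statistics


def did_estimate(panel, treated, policy_year):
    """Simple 2x2 DID: change in the treated-group mean minus the change
    in the control-group mean, before vs. after policy_year."""
    cells = {(g, p): [] for g in (0, 1) for p in (0, 1)}
    for (city, year), y in panel.items():
        cells[(int(city in treated), int(year >= policy_year))].append(y)
    m = {key: statistics.fmean(vals) for key, vals in cells.items()}
    return (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])


rng = random.Random(0)
cities = list(range(60))
years = list(range(2010, 2020))
true_treated = set(cities[:15])   # hypothetical pilot cities
true_year, true_effect = 2016, -1.0

# Synthetic outcome: city effect + common trend + noise (+ policy effect).
panel = {}
for c in cities:
    city_fe = rng.gauss(0, 1)
    for t in years:
        y = city_fe + 0.1 * (t - 2010) + rng.gauss(0, 0.5)
        if c in true_treated and t >= true_year:
            y += true_effect
        panel[(c, t)] = y

real_estimate = did_estimate(panel, true_treated, true_year)

# Placebo draws: random false treated cities and a random false policy year.
placebo = []
for _ in range(500):
    fake_treated = set(rng.sample(cities, len(true_treated)))
    fake_year = rng.choice(years[1:])  # keep at least one pre-period year
    placebo.append(did_estimate(panel, fake_treated, fake_year))

share_as_extreme = sum(b <= real_estimate for b in placebo) / len(placebo)
print(f"real estimate: {real_estimate:.3f}")
print(f"placebo mean:  {statistics.fmean(placebo):.3f}")
print(f"share of placebo draws at or below the real estimate: {share_as_extreme:.3f}")
```

If the placebo coefficients cluster around zero while the actual estimate lies far in the tail of their distribution, the estimated effect is unlikely to be an artifact of chance assignment.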

Fig. 3 Placebo test.

Alternative explained variables

We conduct an additional test that replaces total carbon emissions with carbon intensity as the explained variable. The regression result is presented in Column 1 of Table 3.

Alternative treatment groups

We drop the first treatment group because of its limited sample size and restrict the sample period to 2011–2019. The regression results are presented in Columns 2 and 3 of Table 3.

Table 3 Other robustness tests 1.

Alternative estimation methods

This paper adopts three further estimation methods to check the robustness of the baseline regression results. First, according to Chernozhukov et al. (2017), the double machine learning (DML) method can address model misspecification and endogeneity, which relaxes the strong assumptions of DID. We employ DML with the LASSO algorithm for prediction and estimation (for details, see Appendix C.1); the regression result is presented in Column 4 of Table 3. Second, the sensitivity analysis assumes that unobserved variables do not affect the baseline regression results, yet Guo et al. (2022) point out that causal inference from observational data may be invalidated by hidden confounding. We therefore adopt the Doubly Debiased Lasso (DDL) to remove the bias that hidden confounding introduces into the empirical results (for details, see Appendix C.2); the DDL result is presented in Column 5 of Table 3. Third, systematic differences between the treatment and control groups in panel data can create a “period mismatch” problem, so we use the step-by-step PSM-DID method to mitigate it. The regression result in Column 3 of Table 4 shows that carbon emissions decreased significantly after the policy was implemented, which indicates that the baseline finding remains robust.

Table 4 Other robustness tests 2.

Endogeneity problem

Beyond the issues addressed above, possible endogeneity problems should not be ignored. This paper uses the one-period lags of all variables as instrumental variables and applies the wild cluster bootstrap (Roodman et al. 2019) to address endogeneity and test robustness. The results are presented in Columns 1 and 2 of Table 4.

The DID model requires that the policy implementation time and the choice of pilot cities be exogenous, which means they cannot be anticipated. The National Big Data Comprehensive Pilot Zone policy, as well as the selection of pilot cities, may have been shaped by the requirements of other policies such as the “13th Five-Year Plan” and the “Action”. We use the CSDID (Callaway and Sant’Anna, 2021) and SDID (Arkhangelsky et al. 2021) methods to address this endogeneity problem. The results in Columns 4 and 5 of Table 4 show that the estimated coefficients remain significantly negative, indicating no anticipation effect, so the policy passes the exogeneity test.

Overall, the results in Tables 3 and 4 show that the carbon emission reduction effect remains robust. Additional robustness checks, such as the inclusion of other control variables and the exclusion of the effects of other policies, can be found in Appendix B.1 and B.2.

Mechanism analysis

Both the baseline regression and the robustness checks confirm that the construction of NBDCPZ in China significantly reduces carbon emissions. How is this effect realized? This section examines the channels through which the policy affects carbon emissions, noting that different mechanisms may have opposite effects on carbon reduction. Based on the theoretical analysis in the section “Policy background”, the possible mechanisms are accelerating digital infrastructure development and promoting the digital technology effect, that is, raising the level of digital technology innovation. We test these mechanisms using Eqs. (3) and (4).

$$M_{it}=\alpha_{2}+\beta_{2}\,{\rm{BigData}}_{it}+c_{2}\,{\rm{Control}}_{it}+\varepsilon_{it2}$$
(3)
$${\rm{Carbon}}_{it}=\alpha_{3}+\beta_{3}\,{\rm{BigData}}_{it}+\gamma M_{it}+c_{3}\,{\rm{Control}}_{it}+\varepsilon_{it3}$$
(4)

M is the mechanism variable. Digital infrastructure development (M1) is represented by a composite index obtained through the entropy method. Given data availability, and following Khan et al. (2022) and Che et al. (2024), we select three key indicators: total telecommunication traffic, the number of fixed-line telephone users, and the number of mobile phone users. The level of digital technology innovation (M2) is measured by the number of digital technology patents in each city (Nagaoka et al. 2010; Huang et al. 2023; Yang et al. 2024).
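
A composite index of this kind can be computed as in the following minimal sketch of the standard entropy weight method: min-max normalize each indicator, convert it to shares, compute its information entropy, and weight indicators by one minus entropy. The indicator values are hypothetical, the function name is an assumption, and the exact normalization the authors use may differ.

```python
import math


def entropy_weight_index(rows):
    """rows: one list of indicator values per city (higher = more developed).
    Returns (weights per indicator, composite index per city)."""
    n = len(rows)
    cols = list(zip(*rows))

    # Step 1: min-max normalize each indicator to [0, 1].
    norm_cols = []
    for col in cols:
        lo, hi = min(col), max(col)
        norm_cols.append([(x - lo) / (hi - lo) for x in col])

    # Step 2: entropy of each indicator's share distribution (0 * ln 0 := 0).
    raw = []
    for col in norm_cols:
        total = sum(col)
        shares = [x / total for x in col]
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        raw.append(1 - entropy)  # more dispersion -> more weight

    # Step 3: normalize weights and build the weighted composite index.
    weights = [r / sum(raw) for r in raw]
    index = [sum(w * norm_cols[j][i] for j, w in enumerate(weights))
             for i in range(n)]
    return weights, index


# Hypothetical data: [telecom traffic, fixed-line users, mobile users].
cities = [
    [120, 40, 200],
    [300, 35, 450],
    [80, 50, 150],
    [500, 90, 800],
]
weights, index = entropy_weight_index(cities)
print([round(w, 3) for w in weights])
print([round(v, 3) for v in index])
```

The sketch assumes every indicator varies across cities; a constant indicator would need to be dropped before normalization.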

Column 1 in Table 5 shows that the construction of NBDCPZ significantly improves digital infrastructure development. Column 3 shows that digital infrastructure development is a strong channel through which city carbon emissions increase. A possible explanation is that the policy spurs the construction of digital infrastructure, and this development leads to high energy consumption, which works against carbon emission reduction. This finding validates Hypothesis 2.

Table 5 The result of the traditional intermediary stepwise test method.

As shown in Columns 2 and 4 of Table 5, the construction of NBDCPZ also increases the number of digital technology patents, and the regression coefficient is significant. This result indicates that the construction of NBDCPZ effectively improves digital technology innovation and promotes the green development of cities. Improving digital technology innovation is therefore one of the main mechanisms through which NBDCPZ construction delivers a carbon emission reduction dividend. This finding validates Hypothesis 3.
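
To make the stepwise logic explicit, Eq. (3) can be substituted into Eq. (4). This is the standard product-of-coefficients reading of a linear mediation model; it relies on the linearity of both equations and on the absence of unobserved confounding of the mediator, and it is offered as an illustration rather than as the authors' own derivation:

$${\rm{Carbon}}_{it}=(\alpha_{3}+\gamma\alpha_{2})+(\beta_{3}+\gamma\beta_{2})\,{\rm{BigData}}_{it}+(c_{3}+\gamma c_{2})\,{\rm{Control}}_{it}+(\varepsilon_{it3}+\gamma\varepsilon_{it2})$$

Under this reading, β_3 is the direct effect of the policy, γβ_2 is the indirect effect through M, and their sum is the total effect. For digital infrastructure (M1), β_2 > 0 and γ > 0 give a positive indirect effect that raises emissions; for digital technology innovation (M2), β_2 > 0 and γ < 0 give a negative indirect effect that lowers them.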

According to Jiang (2022), endogeneity problems in mechanism tests, such as hidden confounding, should not be ignored because they can bias the results. Although we include control variables and fixed effects, some bias may remain. We therefore adopt DDL (Guo et al., 2022) to remove the estimation bias in the mechanism model. The results in Table 6 indicate that the mechanism tests are robust.

Table 6 Doubly debiased Lasso estimates.

Overall, this paper verifies two channels: digital infrastructure development and digital technology innovation. The continuous expansion of digital infrastructure brings profound and complex changes to the overall socio-economic situation. NBDCPZ significantly advances digital infrastructure development, and this in turn increases carbon emissions to some extent. By contrast, the policy reduces carbon emissions through the digital technology innovation channel. These results indicate that the government still needs to pay attention to the negative effects of digital infrastructure development. On the digital industrialization side, development is driven by digital technology and continuously optimizes the city’s digital industrial structure. However, digital products and industries increase energy consumption; for example, energy-intensive activities such as data centers and Bitcoin mining take up part of the available energy supply and continuously enlarge the carbon footprint. At the same time, the large-scale deployment of infrastructure may create a carbon lock-in effect, leaving cities trapped in high carbon emissions even if they achieve industrial transformation.

Causal path analysis

The discussion above points to the existence of an energy rebound effect. To investigate which channel dominates and whether the rebound effect matters, this paper adopts the average-treatment-effect method (Zhou, 2022; Zhou and Yamamoto, 2023) to examine the pathways through multiple causally ordered mediators for the two channels (for details, see Appendix C.3). Figure 4 shows the general causal path diagram of the chain mediation.

Fig. 4 Causal path analysis diagram of chain mediation.

We use a weighting estimator to decompose the causal paths and OLS for estimation. Table 7 reports the causal path decomposition results, which separate the direct and indirect effects through which NBDCPZ reduces carbon emissions. The chain path is: “construction of NBDCPZ leads to digital infrastructure development, and the expansion of infrastructure improves the digital technology innovation, finally contributing to carbon emissions reduction”. Column 1 lists the causal paths into which the average total effect is decomposed, and Column 2 reports the estimates of the average total effect, direct effect, and indirect effects.

Table 7 Mediation analysis: causal path decomposition.

The estimate of the indirect effect through mechanism one (M1) is small and significant only at the 10% level, indicating that its negative effect on carbon emissions reduction is largely offset by the positive effect of digital technology innovation. Figure 5 illustrates this causal path more clearly: the technical effect represents the direct effect of NBDCPZ on carbon reduction, and the chain effect shows the hybrid effect of digital infrastructure development and digital technology innovation on carbon reduction. We hypothesize that the construction of new digital infrastructure provides solid conditions for digital technology innovation, which can directly improve the speed and quality of innovation and thereby further mitigate the energy rebound effect. Although digital infrastructure development on its own increases energy consumption, which hinders carbon reduction, the hybrid effect can be positive overall once it is combined with digital technology innovation.
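
How a rebound path can be offset within a chain of mediators can be seen in a simple linear illustration. The sketch below is not the weighting estimator used in the paper: it assumes a linear structural model with policy A, mediators M1 (infrastructure) and M2 (innovation), and outcome Y (emissions), and every coefficient value is hypothetical and chosen only to show the mechanics.

```python
def chain_paths(a1, a2, d, b1, b2, c):
    """Path effects in a linear chain: A -> M1 -> M2 -> Y.

    a1: A -> M1, a2: A -> M2, d: M1 -> M2,
    b1: M1 -> Y, b2: M2 -> Y, c: A -> Y (direct).
    """
    paths = {
        "direct (A -> Y)": c,
        "A -> M2 -> Y": a2 * b2,
        "A -> M1 -> Y": a1 * b1,             # rebound: infrastructure raises emissions
        "A -> M1 -> M2 -> Y": a1 * d * b2,   # chain: infrastructure enables innovation
    }
    paths["total"] = sum(paths.values())
    return paths


# Hypothetical coefficients (negative = emissions fall).
coef = dict(a1=0.5, a2=0.3, d=0.4, b1=0.2, b2=-0.6, c=-0.1)
paths = chain_paths(**coef)
via_m1 = paths["A -> M1 -> Y"] + paths["A -> M1 -> M2 -> Y"]

# Cross-check against the reduced form: M1 = a1*A, M2 = (a2 + d*a1)*A.
reduced_form_total = (coef["c"] + coef["b1"] * coef["a1"]
                      + coef["b2"] * (coef["a2"] + coef["d"] * coef["a1"]))

for name, value in paths.items():
    print(f"{name}: {value:+.2f}")
print(f"all paths through M1: {via_m1:+.2f}")
```

With these illustrative numbers, the emission-raising path through infrastructure alone (+0.10) is more than offset by the chain path through innovation (−0.12), so the combined effect of all paths passing through M1 is small and negative.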

Fig. 5 A breakdown of the effects of carbon reduction.

Discussion: Policy Learning

This paper derives its policy implications from a data-driven decision-making approach, the policy learning model, which aims to learn a treatment assignment policy that satisfies resource-specific constraints (Athey and Wager, 2021). This model offers a fundamentally different approach from traditional cost-effectiveness analysis (CEA) in policy evaluation. CEA provides a static assessment of policy impact by measuring cost per unit of outcome improvement, whereas policy learning is a dynamic, data-driven method that continuously optimizes treatment assignments based on real-world observational data (Athey and Wager, 2021). Unlike CEA, which relies on fixed cost-benefit calculations, policy learning can iteratively refine policy decisions as more data becomes available, making it particularly useful in non-stationary policy environments. Guided by the principles of maximizing carbon emission reduction and optimizing resource allocation, this section learns a new treatment assignment policy that respects the resource constraints by incorporating observational data and policy shocks into our model (for details, see Appendix C.4). The model estimates a separate response function for each city and ranks the top 38 new pilot cities (see Table 9) by the strength of their carbon reduction effect. We use the Generalized Random Forest (GRF) algorithm (Athey et al., 2019) to estimate the learned policies. The list of new pilot cities produced by GRF shows that the learned pilot zone construction policy can reduce carbon emissions by 37.22% (see Column 1 in Table 8). Compared with the original policy choice, the carbon emission reduction effect improves by 95.38%, effectively compensating for the 18.17% carbon emission reduction welfare loss caused by the original distribution rule.
At the same time, Table 9 compares carbon emissions between the cities selected by the GRF method and those selected by the original policy; the co-selection rate between the two sets of cities is 31.58%. This finding indicates that the cities selected through policy learning outperform those selected by the original policy in terms of carbon emission reduction benefits. Under the original policy framework, the estimated change in carbon emissions is greater than zero for most of the selected cities, suggesting that the original policy may not have selected the most suitable pilot cities. A change in carbon emissions greater than zero may indicate that big data policies are counterproductive in promoting low-carbon development in those cities, thus hindering the country’s progress towards the “dual carbon” goal. Figure 6 shows in detail the carbon emissions of each city as estimated by the generalized random forest. A total of 228 cities achieve carbon emission reduction, while 54 cities show increased carbon emissions. The results reveal that in economically developed cities, such as Shanghai, Beijing, and Dongguan, the implementation of big data policies may lead to an increase in carbon emissions.
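
The budget-constrained selection and the headline figures can be illustrated with a short sketch. The first part uses hypothetical city labels and effect estimates and a plain top-k rule, which is a simplification of the policy learning approach of Athey and Wager (2021) and not the GRF-based procedure of Appendix C.4. The second part only redoes the arithmetic on numbers quoted in this paper: the 37.22% learned-policy reduction, the roughly 19.05% baseline reduction, and the 31.58% co-selection rate over 38 cities.

```python
def top_k_policy(effects, k):
    """Pick the k cities with the most negative estimated effect on emissions."""
    return set(sorted(effects, key=effects.get)[:k])


def co_selection_rate(learned, original):
    """Share of the learned selection that the original policy also chose."""
    return len(learned & original) / len(learned)


# Hypothetical city-level effect estimates (negative = emissions fall).
effects = {"A": -0.40, "B": 0.10, "C": -0.25, "D": -0.05, "E": 0.20, "F": -0.30}
original = {"A", "B", "D"}          # hypothetical original pilot cities
learned = top_k_policy(effects, k=3)
rate = co_selection_rate(learned, original)
print(sorted(learned), round(rate, 4))

# Arithmetic on the figures quoted in the text.
learned_cut, original_cut = 37.22, 19.05     # percent reductions
gap = learned_cut - original_cut             # welfare loss of the original rule
improvement = 100 * gap / original_cut       # relative improvement, in percent
implied_overlap = round(0.3158 * 38)         # cities chosen by both rules
print(round(gap, 2), round(improvement, 2), implied_overlap)
```

Read this way, the 18.17-point gap is the difference between the learned and original reductions, the 95.38% improvement is that gap relative to the original effect, and a 31.58% co-selection rate over 38 cities corresponds to 12 cities appearing in both lists.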

Table 8 Comparison of DID estimates between new allocation rules of cities and original policy choices.
Table 9 Optimizing Big Data Policy: a comparison of carbon reduction in cities selected by the original policy and the policy learning model.
Fig. 6 GRF estimation of carbon emissions for each city.

The construction of a Big Data Pilot Zone can effectively shift city management and industry from a high-energy-consumption development model to a technology-driven, energy-efficient one. Economically developed cities, with their developed infrastructure and mature market environment, show strong capabilities in digital implementation. However, this transformation is accompanied by a significant increase in demand for data processing and storage, which prompts the expansion of data centers. As energy-intensive facilities, data centers inevitably increase energy consumption and carbon emissions. Faster economic growth and expanding commercial activity and logistics demand further raise carbon emissions in the transportation sector. Thus, in cities with better economic development, higher technological levels, and lower pollution levels, digitalization can lead to an increase in carbon emissions.

In economically underdeveloped areas, due to the lower degree of industrial transformation and poor economic development, it is challenging to develop digital industries. The role of the Big Data Policy is mainly focused on promoting the digital transformation of the industry. This transformation process leads to carbon reduction by improving production efficiency and optimizing resource allocation.

Although we tend to believe that the learned allocation rule is closer to ideal, we must recognize that the main purpose of the “National Big Data Comprehensive Pilot Zones” is not limited to carbon reduction. The policy focuses more on using big data to raise the level of the digital economy in cities and to facilitate the digital transformation of enterprises. The original allocation rule may therefore have balanced efficiency and fairness. However, policy documents such as “dual carbon” and the “14th Five-Year Plan” all focus on addressing extreme climate events, global warming, and other climate change challenges.

Based on this, identifying cities that can maximize “carbon reduction” efficiency through statistical methods is of great significance for China to achieve its “dual carbon” goals early and for the country’s transition to a low-carbon economy.

The result of policy learning indicates that the original pilot cities cannot realize carbon reduction maximization. It is important for Chinese governments to establish a systematic monitoring and evaluation mechanism that can assess the actual performance of big data policies in carbon emission reduction across the pilot zones. Provinces and cities that have established pilot zones should collect and analyze relevant data for each batch of policy implementation and continuously optimize treatment assignment policy. These measures enable precise policy adjustments, achieving an organic combination of “crossing the river by feeling the stones” and “top-level design,” thereby reducing the cost of policy execution and ensuring that the “National Big Data Comprehensive Pilot Zone” becomes a key strategic initiative in promoting low-carbon transformation. Moreover, big data policies should be tailored to the characteristics of pilot cities. For cities with high economic development, the focus should be on leveraging their influence in digital technology to enhance the economic and environmental benefits of surrounding cities. In contrast, less developed regions should pay more attention to the process of industrial digitalization, promoting the digital transformation of industries to improve production efficiency and economic benefits while reducing carbon emissions. This approach not only helps to narrow the gap with developed regions but also effectively reduces regional imbalances.

Conclusions and policy implications

Conclusions

This paper finds that the “National Big Data Comprehensive Pilot Zones” policy has reduced urban carbon emissions by approximately 19.05%. Through the development of digital infrastructure, the policy indirectly weakens its own carbon reduction effect, indicating the presence of a digital rebound effect. However, the improvement in urban digital technology innovation under the guidance of the policy has contributed to carbon emission reduction, and this technological effect is the core mechanism of urban carbon reduction. The level of digital infrastructure is a crucial indicator of urban digital development, and its expansion reflects, to some extent, the city’s dependence on digital infrastructure.

Given the simultaneous presence of both rebound and technological effects, we are concerned about whether these two mechanisms interact. If they do, we are particularly interested in the severity of the rebound effect. Therefore, we employed causal path analysis (CPA) to consider both mechanisms within a single model. In our CPA, we found that the negative impact of the rebound effect is not severe. This may be because digital infrastructure is the foundation of technological innovation, providing essential public service facilities for digital transformation, intelligent upgrades, and integrated innovation. The improvement and expansion of these facilities enhance the quality and speed of digital technology innovation.

Additionally, in the discussion section, we further explore the role of policy learning. Through Policy Learning using the generalized random forest (GRF) method, we identified 38 new pilot cities with the most significant carbon reduction effects. The learned allocation policy demonstrates a 37.22% reduction in carbon emissions, improving the effectiveness of the original policy by 95.38% and compensating for the 18.17% welfare loss caused by the initial distribution rule. Furthermore, we believe that economically developed regions may experience increased emissions due to data center expansion and rising commercial activities, while underdeveloped areas benefit from improved efficiency and optimized resource allocation, leading to carbon reduction.

Policy implications

Digital technology innovation has had a positive effect on carbon emission reduction, emphasizing the importance of promoting technological innovation and application. Chinese governments should take measures to incentivize and support technological innovations that can directly or indirectly reduce carbon emissions, including providing financial subsidies, tax incentives, and support for research and development, thereby accelerating the research, development, and application of green technologies. Considering the heterogeneity of the effects of big data policies on carbon emission reduction in different regions, the refinement and adjustment of policies are particularly crucial.

Within China’s governance framework, local governments often grapple with balancing multiple objectives, a challenge compounded by their limited administrative capacity and attention (Chen and Jia, 2023). As a result, local officials tend to prioritize policy goals that are directly tied to career incentives, such as economic growth, or those that are more straightforward and visible, like digital infrastructure expansion (Zhou, 2007; Chen et al. 2024). In contrast, objectives such as utilizing digitalization to curb carbon emissions, which are less immediately measurable and carry less weight in official evaluations, often receive lower priority. For example, local governments may redirect their focus toward more visible metrics or outcomes that promise immediate rewards, thereby sidelining less tangible goals like carbon reduction (Chen et al. 2024). This issue reflects a dual challenge: practical constraints arising from limited resources and attention, and institutional barriers embedded within the current evaluation and accountability systems, which inadequately incentivize efforts toward long-term sustainable practices.

To address the challenges of ensuring the effective implementation of carbon reduction objectives within local government frameworks, we propose enhancing the integration of digital infrastructure policies with environmental goals by establishing robust accountability mechanisms and encouraging inter-departmental collaboration. Performance evaluations should be refined to include specific metrics that measure the environmental impacts of digital infrastructure projects, such as energy efficiency improvements and emissions reductions, ensuring these projects align with broader carbon reduction goals. Additionally, capacity-building initiatives should be implemented to equip local officials with the necessary skills to utilize digital technologies for emissions tracking and energy optimization. By fostering collaboration between relevant departments and emphasizing clear, measurable environmental targets, these measures can help overcome practical constraints and institutional obstacles, ensuring that carbon reduction remains a priority alongside other tangible policy goals.