Mismatch between input and output data size of normalize_with_expected_power() #341

Open
Matammanjunath opened this issue Aug 14, 2022 · 0 comments


Describe the bug
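I am performing a simple degradation analysis using rdtools. The series returned by normalize_with_expected_power() are one element shorter than the input, so assigning their values back to the DataFrame raises a length-mismatch ValueError.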

Full error message and traceback

normalized, insolation = rdtools.normalize_with_expected_power(df[pwr_col],
                                                                modeled_power,
                                                                df[poa_col])
---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)

<ipython-input-67-e6496a97d32b> in <module>()
----> 1 df['normalized'] = normalized.values
      2 df['insolation'] = insolation.values

3 frames

/usr/local/lib/python3.7/dist-packages/pandas/core/common.py in require_length_match(data, index)
    530     if len(data) != len(index):
    531         raise ValueError(
--> 532             "Length of values "
    533             f"({len(data)}) "
    534             "does not match length of index "

ValueError: Length of values (1578239) does not match length of index (1578240)
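
The error is raised because `.values` discards the index, so pandas requires the array length to equal `len(df)` exactly. The following is a minimal sketch of that behavior and of one possible workaround, using a small made-up frame in place of the real data (the numbers and the four-sample index are illustrative assumptions, not the NIST dataset). Assigning the Series itself, rather than its raw values, aligns on the index and leaves NaN at the missing timestamp.

```python
import numpy as np
import pandas as pd

# Illustrative stand-in for the real data: four one-minute samples.
index = pd.date_range("2015-01-01 00:00", periods=4, freq="min")
df = pd.DataFrame({"power": [1.0, 2.0, 3.0, 4.0]}, index=index)

# Stand-in for the function output: one element shorter,
# with the first timestamp missing.
normalized = pd.Series([0.9, 0.95, 1.0], index=index[1:])

# Assigning the raw values requires equal lengths and raises ValueError.
try:
    df["normalized"] = normalized.values
    raised = False
except ValueError:
    raised = True

# Assigning the Series aligns on the index; the missing row becomes NaN.
df["normalized"] = normalized
```

Whether a NaN in the first row is acceptable depends on the downstream analysis; the sketch only shows that index-aligned assignment avoids the length check.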

To Reproduce
In this case I used the NIST dataset of a PV plant. Below is a modified copy of the function with print statements added for debugging, to check how the data size changes at each step.


def normalize_with_expected_power(pv, power_expected, poa_global,
                                  pv_input='power'):
    '''
    Normalize PV power or energy based on expected PV power.

    Parameters
    ----------
    pv : pandas.Series
        Right-labeled time series PV energy or power. If energy, should *not*
        be cumulative, but only for preceding time step. Type (energy or power)
        must be specified in the ``pv_input`` parameter.
    power_expected : pandas.Series
        Right-labeled time series of expected PV power. (Note: Expected energy
        is not supported.)
    poa_global : pandas.Series
        Right-labeled time series of plane-of-array irradiance associated with
        ``power_expected``
    pv_input : str, {'power' or 'energy'}
        Specifies the type of input used for ``pv`` parameter. Default: 'power'

    Returns
    -------
    energy_normalized : pandas.Series
        Energy normalized based on ``power_expected``
    insolation : pandas.Series
        Insolation associated with each normalized point

    '''
    print("input pv shape is %s"%(pv.shape))
    print("input power_expected shape is %s"%(power_expected.shape))
    print("input POA shape is %s"%(poa_global.shape))
    freq = _check_series_frequency(pv, 'pv')
    print(pv.shape)
    print(power_expected.shape)
    if pv_input == 'power':
        energy = energy_from_power(pv, freq, power_type='right_labeled')
        print("Energy shape is %s"%(energy.shape))
    elif pv_input == 'energy':
        energy = pv.copy()
        energy.name = 'energy_Wh'
    else:
        raise ValueError("Unexpected value for pv_input. pv_input should be 'power' or 'energy'.")

    model_tds, mean_model_td = _delta_index(power_expected)
    print("Model TDS shape is %s"%(model_tds.shape))
    measure_tds, mean_measure_td = _delta_index(energy)
    print("Measure TDS shape is %s"%(measure_tds.shape))

    # Case in which the model is less frequent than the measurements
    if mean_model_td > mean_measure_td:
        power_expected = interpolate(power_expected, pv.index)
        print("Power expected shape is %s"%(power_expected.shape))
        poa_global = interpolate(poa_global, pv.index)
        print("POA shape is %s"%(poa_global.shape))

    energy_expected = energy_from_power(power_expected, freq, power_type='right_labeled')
    print("Energy expected shape is %s"%(energy_expected.shape))
    insolation = energy_from_power(poa_global, freq, power_type='right_labeled')
    print("Insolation shape is %s"%(insolation.shape))

    energy_normalized = energy / energy_expected
    print("Energy normalized shape is %s"%(energy_normalized.shape))

    index_union = energy_normalized.index.union(insolation.index)
    print("index_union shape is %s"%(index_union.shape))
    energy_normalized = energy_normalized.reindex(index_union)
    print("energy_normalized shape is %s"%(energy_normalized.shape))
    insolation = insolation.reindex(index_union)
    print("insolation shape is %s"%(insolation.shape))

    return energy_normalized, insolation

I reran the above function with the required data:

normalized, insolation = normalize_with_expected_power(df[pwr_col],modeled_power,df[poa_col])
The sizes of the input and output data of the above function are:

**input data size 1578240**
df.index 
DatetimeIndex(['2015-01-01 00:00:00-05:00', '2015-01-01 00:01:00-05:00',
               ....
               '2017-12-31 23:58:00-05:00', '2017-12-31 23:59:00-05:00'],
              dtype='datetime64[ns, pytz.FixedOffset(-300)]', length=1578240, freq='T')
modeled_power.index
DatetimeIndex(['2015-01-01 00:00:00-05:00', '2015-01-01 00:01:00-05:00',
               ....
               '2017-12-31 23:58:00-05:00', '2017-12-31 23:59:00-05:00'],
              dtype='datetime64[ns, pytz.FixedOffset(-300)]', length=1578240, freq='T')

**output data size 1578239**
normalized.index
DatetimeIndex(['2015-01-01 00:01:00-05:00', '2015-01-01 00:02:00-05:00',
               ....
               '2017-12-31 23:58:00-05:00', '2017-12-31 23:59:00-05:00'],
              dtype='datetime64[ns, pytz.FixedOffset(-300)]', length=1578239, freq='T')
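
Comparing the two index printouts above shows that the output starts at 00:01 rather than 00:00. A quick way to confirm which timestamps were dropped is `Index.difference`. The sketch below uses a short illustrative index rather than the real one; `input_index` and `output_index` are placeholder names standing in for `df.index` and `normalized.index`.

```python
import pandas as pd

# Placeholder indexes standing in for df.index and normalized.index.
input_index = pd.date_range("2015-01-01 00:00", periods=5, freq="min")
output_index = input_index[1:]

# Timestamps present in the input but absent from the output.
missing = input_index.difference(output_index)
print(missing)
```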


The data size changes inside the normalize_with_expected_power() function. As the output below shows, the culprit is the energy series, which is one element shorter than the input data. The shorter series comes out of the energy_from_power() function; digging further into that function, I found that the root cause is the _aggregate() function.

input pv shape is 1578240
input power_expected shape is 1578240
input POA shape is 1578240
(1578240,)
(1578240,)
Energy shape is 1578239
Model TDS shape is 1578240
Measure TDS shape is 1578239
Energy expected shape is 1578239
Insolation shape is 1578239
Energy normalized shape is 1578239
index_union shape is 1578239
energy_normalized shape is 1578239
insolation shape is 1578239
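
One plausible explanation for the missing element, offered as an assumption rather than a description of the rdtools internals: if right-labeled energy is computed from consecutive pairs of power samples (for example with the trapezoid rule), the first sample has no preceding interval, so N power values yield N-1 energy values. The helper below is hypothetical and is not part of rdtools; it only illustrates that effect.

```python
import pandas as pd


def trapezoid_energy_right_labeled(power):
    """Hypothetical helper (not rdtools): trapezoid-rule energy in Wh,
    labeled at the right edge of each interval."""
    # Interval length in hours; the first entry is NaN because there is
    # no preceding timestamp.
    hours = power.index.to_series().diff().dt.total_seconds() / 3600.0
    # Mean power over each interval from its two endpoint samples.
    mean_power = (power + power.shift(1)) / 2.0
    energy = mean_power * hours
    # The first value is undefined and is dropped here.
    return energy.dropna()


index = pd.date_range("2015-01-01 00:00", periods=4, freq="min")
power = pd.Series([0.0, 60.0, 120.0, 60.0], index=index)
energy = trapezoid_energy_right_labeled(power)
print(len(power), len(energy))
```

Under this assumption, the output is always one element shorter than the input and starts at the second timestamp, which matches the shapes printed above.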

