Ndtyjky

Question

How do I get the exponential weighted moving average in NumPy just like the following in pandas?

import pandas as pd

import pandas_datareader as pdr

from datetime import datetime



# Declare variables

ibm = pdr.get_data_yahoo(symbols='IBM', start=datetime(2000, 1, 1), end=datetime(2012, 1, 1)).reset_index(drop=True)['Adj Close']

windowSize = 20



# Get PANDAS exponential weighted moving average

ewm_pd = pd.DataFrame(ibm).ewm(span=windowSize, min_periods=windowSize).mean().as_matrix()



print(ewm_pd)

I tried the following with NumPy

import numpy as np

import pandas_datareader as pdr

from datetime import datetime



# From this post: http://stackoverflow.com/a/40085052/3293881 by @Divakar

def strided_app(a, L, S): # Window len = L, Stride len/stepsize = S

    nrows = ((a.size - L) // S) + 1

    n = a.strides[0]

    return np.lib.stride_tricks.as_strided(a, shape=(nrows, L), strides=(S * n, n))



def numpyEWMA(price, windowSize):

    weights = np.exp(np.linspace(-1., 0., windowSize))

    weights /= weights.sum()



    a2D = strided_app(price, windowSize, 1)



    returnArray = np.empty((price.shape[0]))

    returnArray.fill(np.nan)

    for index in (range(a2D.shape[0])):

        returnArray[index + windowSize-1] = np.convolve(weights, a2D[index])[windowSize - 1:-windowSize + 1]

    return np.reshape(returnArray, (-1, 1))



# Declare variables

ibm = pdr.get_data_yahoo(symbols='IBM', start=datetime(2000, 1, 1), end=datetime(2012, 1, 1)).reset_index(drop=True)['Adj Close']

windowSize = 20



# Get NumPy exponential weighted moving average

ewma_np = numpyEWMA(ibm, windowSize)



print(ewma_np)

But the results are not similar as the ones in pandas.

Is there maybe a better approach to calculate the exponential weighted moving average directly in NumPy and get the exact same result as the pandas.ewm().mean()?

At 60,000 requests on pandas solution, I get about 230 seconds. I am sure that with a pure NumPy, this can be decreased significantly.

Why not just use the working Pandas code that you already have? — Mar 18 '17 at 1:48
because the performance is slower with Pandas and i look for alternatives with vectorized numpy — Mar 18 '17 at 1:52
OK, so over there you have apply() which is effectively a Python loop. No surprise it was slow. But in this question you have no loop, it is already vectorized. So I think you are looking to solve a problem which may not exist. — Mar 18 '17 at 2:58
Offering a bounty won't make your question magically answerable. Please put together a Minimal, Complete, and Verifiable example. — Mar 20 '17 at 9:17

score 3 · Accepted Answer · 2019-02-19 22:42:14Z

Updated 27/11/2018

WORKING PURE NUMPY, FAST & VECTORIZED SOLUTION FOR LARGE INPUTS

out parameter for in-place computation,
dtype parameter,
index order parameter

@Divakar's answer leads to floating point precision problems when the input is too large. This is because (1-alpha)**(n+1) -> 0 when n -> inf and alpha -> 1, leading to divide-by-zero's and NaN values popping up in the calculation.

Here is my fastest solution with no precision problems, nearly fully vectorized. It's gotten a little complicated but the performance is great, especially for really huge inputs. Without using in-place calculations (which is possible using the out parameter, saving memory allocation time): 3.62 seconds for 100M element input vector, 3.2ms for a 100K element input vector, and 293µs for a 5000 element input vector on a pretty old PC (results will vary with different alpha/row_size values).

# tested with python3 & numpy 1.15.2

import numpy as np



def ewma_vectorized_safe(data, alpha, row_size=None, dtype=None, order='C', out=None):

    """

    Reshapes data before calculating EWMA, then iterates once over the rows

    to calculate the offset without precision issues

    :param data: Input data, will be flattened.

    :param alpha: scalar float in range (0,1)

        The alpha parameter for the moving average.

    :param row_size: int, optional

        The row size to use in the computation. High row sizes need higher precision,

        low values will impact performance. The optimal value depends on the

        platform and the alpha being used. Higher alpha values require lower

        row size. Default depends on dtype.

    :param dtype: optional

        Data type used for calculations. Defaults to float64 unless

        data.dtype is float32, then it will use float32.

    :param order: {'C', 'F', 'A'}, optional

        Order to use when flattening the data. Defaults to 'C'.

    :param out: ndarray, or None, optional

        A location into which the result is stored. If provided, it must have

        the same shape as the desired output. If not provided or `None`,

        a freshly-allocated array is returned.

    :return: The flattened result.

    """

    data = np.array(data, copy=False)



    if dtype is None:

        if data.dtype == np.float32:

            dtype = np.float32

        else:

            dtype = np.float

    else:

        dtype = np.dtype(dtype)



    row_size = int(row_size) if row_size is not None 

               else get_max_row_size(alpha, dtype)



    if data.size <= row_size:

        # The normal function can handle this input, use that

        return ewma_vectorized(data, alpha, dtype=dtype, order=order, out=out)



    if data.ndim > 1:

        # flatten input

        data = np.reshape(data, -1, order=order)



    if out is None:

        out = np.empty_like(data, dtype=dtype)

    else:

        assert out.shape == data.shape

        assert out.dtype == dtype



    row_n = int(data.size // row_size)  # the number of rows to use

    trailing_n = int(data.size % row_size)  # the amount of data leftover

    first_offset = data[0]



    if trailing_n > 0:

        # set temporary results to slice view of out parameter

        out_main_view = np.reshape(out[:-trailing_n], (row_n, row_size))

        data_main_view = np.reshape(data[:-trailing_n], (row_n, row_size))

    else:

        out_main_view = out

        data_main_view = data



    # get all the scaled cumulative sums with 0 offset

    ewma_vectorized_2d(data_main_view, alpha, axis=1, offset=0, dtype=dtype,

                       order='C', out=out_main_view)



    scaling_factors = (1 - alpha) ** np.arange(1, row_size + 1)

    last_scaling_factor = scaling_factors[-1]



    # create offset array

    offsets = np.empty(out_main_view.shape[0], dtype=dtype)

    offsets[0] = first_offset

    # iteratively calculate offset for each row

    for i in range(1, out_main_view.shape[0]):

        offsets[i] = offsets[i - 1] * last_scaling_factor + out_main_view[i - 1, -1]



    # add the offsets to the result

    out_main_view += offsets[:, np.newaxis] * scaling_factors[np.newaxis, :]



    if trailing_n > 0:

        # process trailing data in the 2nd slice of the out parameter

        ewma_vectorized(data[-trailing_n:], alpha, offset=out_main_view[-1, -1],

                        dtype=dtype, order='C', out=out[-trailing_n:])

    return out



def get_max_row_size(alpha, dtype=float):

    assert 0. <= alpha < 1.

    # This will return the maximum row size possible on 

    # your platform for the given dtype. I can find no impact on accuracy

    # at this value on my machine.

    # Might not be the optimal value for speed, which is hard to predict

    # due to numpy's optimizations

    # Use np.finfo(dtype).eps if you  are worried about accuracy

    # and want to be extra safe.

    epsilon = np.finfo(dtype).tiny

    # If this produces an OverflowError, make epsilon larger

    return int(np.log(epsilon)/np.log(1-alpha)) + 1

The 1D ewma function:

def ewma_vectorized(data, alpha, offset=None, dtype=None, order='C', out=None):

    """

    Calculates the exponential moving average over a vector.

    Will fail for large inputs.

    :param data: Input data

    :param alpha: scalar float in range (0,1)

        The alpha parameter for the moving average.

    :param offset: optional

        The offset for the moving average, scalar. Defaults to data[0].

    :param dtype: optional

        Data type used for calculations. Defaults to float64 unless

        data.dtype is float32, then it will use float32.

    :param order: {'C', 'F', 'A'}, optional

        Order to use when flattening the data. Defaults to 'C'.

    :param out: ndarray, or None, optional

        A location into which the result is stored. If provided, it must have

        the same shape as the input. If not provided or `None`,

        a freshly-allocated array is returned.

    """

    data = np.array(data, copy=False)



    if dtype is None:

        if data.dtype == np.float32:

            dtype = np.float32

        else:

            dtype = np.float64

    else:

        dtype = np.dtype(dtype)



    if data.ndim > 1:

        # flatten input

        data = data.reshape(-1, order)



    if out is None:

        out = np.empty_like(data, dtype=dtype)

    else:

        assert out.shape == data.shape

        assert out.dtype == dtype



    if data.size < 1:

        # empty input, return empty array

        return out



    if offset is None:

        offset = data[0]



    alpha = np.array(alpha, copy=False).astype(dtype, copy=False)



    # scaling_factors -> 0 as len(data) gets large

    # this leads to divide-by-zeros below

    scaling_factors = np.power(1. - alpha, np.arange(row_size + 1, dtype=dtype),

                               dtype=dtype)

    # create cumulative sum array

    np.multiply(data, (alpha * scaling_factors[-2]) / scaling_factors[:-1],

                dtype=dtype, out=out)

    np.cumsum(out, dtype=dtype, out=out)



    # cumsums / scaling

    out /= scaling_factors[-2::-1]



    if offset != 0:

        offset = np.array(offset, copy=False).astype(dtype, copy=False)

        # add offsets

        out += offset * scaling_factors[1:]



    return out

The 2D ewma function:

def ewma_vectorized_2d(data, alpha, axis=None, offset=None, dtype=None, order='C', out=None):

    """

    Calculates the exponential moving average over a given axis.

    :param data: Input data, must be 1D or 2D array.

    :param alpha: scalar float in range (0,1)

        The alpha parameter for the moving average.

    :param axis: The axis to apply the moving average on.

        If axis==None, the data is flattened.

    :param offset: optional

        The offset for the moving average. Must be scalar or a

        vector with one element for each row of data. If set to None,

        defaults to the first value of each row.

    :param dtype: optional

        Data type used for calculations. Defaults to float64 unless

        data.dtype is float32, then it will use float32.

    :param order: {'C', 'F', 'A'}, optional

        Order to use when flattening the data. Ignored if axis is not None.

    :param out: ndarray, or None, optional

        A location into which the result is stored. If provided, it must have

        the same shape as the desired output. If not provided or `None`,

        a freshly-allocated array is returned.

    """

    data = np.array(data, copy=False)



    assert data.ndim <= 2



    if dtype is None:

        if data.dtype == np.float32:

            dtype = np.float32

        else:

            dtype = np.float64

    else:

        dtype = np.dtype(dtype)



    if out is None:

        out = np.empty_like(data, dtype=dtype)

    else:

        assert out.shape == data.shape

        assert out.dtype == dtype



    if data.size < 1:

        # empty input, return empty array

        return out



    if axis is None or data.ndim < 2:

        # use 1D version

        if isinstance(offset, np.ndarray):

            offset = offset[0]

        return ewma_vectorized(data, alpha, offset, dtype=dtype, order=order,

                               out=out)



    assert -data.ndim <= axis < data.ndim



    # create reshaped data views

    out_view = out

    if axis < 0:

        axis = data.ndim - int(axis)



    if axis == 0:

        # transpose data views so columns are treated as rows

        data = data.T

        out_view = out_view.T



    if offset is None:

        # use the first element of each row as the offset

        offset = np.copy(data[:, 0])

    elif np.size(offset) == 1:

        offset = np.reshape(offset, (1,))



    alpha = np.array(alpha, copy=False).astype(dtype, copy=False)



    # calculate the moving average

    row_size = data.shape[1]

    row_n = data.shape[0]

    scaling_factors = np.power(1. - alpha, np.arange(row_size + 1, dtype=dtype),

                               dtype=dtype)

    # create a scaled cumulative sum array

    np.multiply(

        data,

        np.multiply(alpha * scaling_factors[-2], np.ones((row_n, 1), dtype=dtype),

                    dtype=dtype)

        / scaling_factors[np.newaxis, :-1],

        dtype=dtype, out=out_view

    )

    np.cumsum(out_view, axis=1, dtype=dtype, out=out_view)

    out_view /= scaling_factors[np.newaxis, -2::-1]



    if not (np.size(offset) == 1 and offset == 0):

        offset = offset.astype(dtype, copy=False)

        # add the offsets to the scaled cumulative sums

        out_view += offset[:, np.newaxis] * scaling_factors[np.newaxis, 1:]



    return out

usage:

data_n = 100000000

data = ((0.5*np.random.randn(data_n)+0.5) % 1) * 100

window = 5000

sum_proportion = .875

alpha = 1 - np.exp(np.log(1 - sum_proportion) / window)  # or 2/(window+1) for panda's span function

result = ewma_vectorized_safe(data, alpha)

Just a tip

It is easy to calculate a 'window size' (technically exponential averages have infinite 'windows') for a given alpha, dependent on the contribution of the data in that window to the average. This is useful for example to chose how much of the start of the result to treat as unreliable due to border effects.

sum_proportion = .99  # window covers 99% of contribution to the moving average

scale_factors = (1 - alpha) ** (np.arange(data.shape[0] + 1))

scale_factor_cumsum = np.cumsum(scale_factors)

# window_size is the index of the first partial sum of scale_factors

# where partial_sum > sum_proportion * total_sum.

# Increases with increased sum_proportion and decreased alpha

window_size = np.argmax(scale_factor_cumsum > sum_proportion * scale_factor_cumsum[-1])

or more math-y but efficiently:

def window_size(alpha, sum_proportion):

    # solve (1-alpha)**window_size = (1-sum_proportion) for window_size        

    return int(np.log(1-sum_proportion) / np.log(1-alpha))

The alpha = 2 / (window_size + 1.0) relation used in this thread (the 'span' option from pandas) is a very rough approximation of the inverse of the above function (with sum_proportion~=0.87). alpha = 1 - np.exp(np.log(1-sum_proportion)/window_size) is more accurate (the 'half-life' option from pandas equals this formula with sum_proportion=0.5).

In the following example, data represents a continuous noisy signal. cutoff_idx is the first position in result where at least 99% of the value is dependent on separate values in data (i.e. less than 1% depends on data[0]). The data up to cutoff_idx is excluded from the final results because it is too dependent on the first value in data, therefore possibly skewing the average.

result = ewma_vectorized_safe(data, alpha, chunk_size)

sum_proportion = .99

cutoff_idx = window_size(alpha, sum_proportion)

result = result[cutoff_idx:]

To illustrate the problem the above solve you can run this a few times, notice the often-appearing false start of the red line, which is skipped after cutoff_idx:

data_n = 100000

data = np.random.rand(data_n) * 100

window = 1000

chunk_size = 10000

sum_proportion = .99

alpha = 1 - np.exp(np.log(1-sum_proportion)/window)



result = ewma_vectorized_safe(data, alpha, chunk_size)



cutoff_idx = window_size(alpha, sum_proportion)

x = np.arange(start=0, stop=result.size)



import matplotlib.pyplot as plt

plt.plot(x[:cutoff_idx+1], result[:cutoff_idx+1], '-r',

         x[cutoff_idx:], result[cutoff_idx:], '-b')

plt.show()

note that cutoff_idx==window because alpha was set with the inverse of the window_size() function, with the same sum_proportion.

Peter Mortensen 13.7k1986113 · Accepted Answer · 2018-11-17 15:24:04Z

I think I have finally cracked it!

Here's a vectorized version of numpy_ewma function that's claimed to be producing the correct results from @RaduS's post -

def numpy_ewma_vectorized(data, window):



    alpha = 2 /(window + 1.0)

    alpha_rev = 1-alpha



    scale = 1/alpha_rev

    n = data.shape[0]



    r = np.arange(n)

    scale_arr = scale**r

    offset = data[0]*alpha_rev**(r+1)

    pw0 = alpha*alpha_rev**(n-1)



    mult = data*pw0*scale_arr

    cumsums = mult.cumsum()

    out = offset + cumsums*scale_arr[::-1]

    return out

Further boost

We can boost it further with some code re-use, like so -

def numpy_ewma_vectorized_v2(data, window):



    alpha = 2 /(window + 1.0)

    alpha_rev = 1-alpha

    n = data.shape[0]



    pows = alpha_rev**(np.arange(n+1))



    scale_arr = 1/pows[:-1]

    offset = data[0]*pows[1:]

    pw0 = alpha*alpha_rev**(n-1)



    mult = data*pw0*scale_arr

    cumsums = mult.cumsum()

    out = offset + cumsums*scale_arr[::-1]

    return out

Runtime test

Let's time these two against the same loopy function for a big dataset.

In [97]: data = np.random.randint(2,9,(5000))

    ...: window = 20

    ...:



In [98]: np.allclose(numpy_ewma(data, window), numpy_ewma_vectorized(data, window))

Out[98]: True



In [99]: np.allclose(numpy_ewma(data, window), numpy_ewma_vectorized_v2(data, window))

Out[99]: True



In [100]: %timeit numpy_ewma(data, window)

100 loops, best of 3: 6.03 ms per loop



In [101]: %timeit numpy_ewma_vectorized(data, window)

1000 loops, best of 3: 665 µs per loop



In [102]: %timeit numpy_ewma_vectorized_v2(data, window)

1000 loops, best of 3: 357 µs per loop



In [103]: 6030/357.0

Out[103]: 16.89075630252101

There is around a 17 times speedup!

that's it :) This one is both super fast and correct. Thank you again @Divakar. Is it allowed to change the right answer on stackoverflow? as well as i would like to award this answer with the bounty as it meets my initial requirements of beating pandas performance — Mar 21 '17 at 12:40
@RaduS Yes, changing the accept on answers to some other answer is allowed, if you are getting a better one :) — Mar 21 '17 at 12:42
@RaduS Appreciate the bounty! Was an interesting and quite engrossing problem this one, given the inherent recursion in your loopy version, but was worth the fun! :) — Mar 21 '17 at 12:45
This code doesn't work if the input vector is too large. For example, ewma( np.ones(10000), span=10) gives all nans. This occurs because at some point pows becomes all zeros and scale_arr becomes all nans. — Sep 19 '18 at 22:13

Peter Mortensen 13.7k1986113 · Accepted Answer · 2018-11-17 15:26:03Z

Here is an implementation using NumPy that is equivalent to using df.ewm(alpha=alpha).mean(). After reading the documentation, it is just a few matrix operations. The trick is constructing the right matrices.

It is worth noting that because we are creating float matrices, you can quickly eat through your memory if the input array is too large.

import pandas as pd

import numpy as np



def ewma(x, alpha):

    '''

    Returns the exponentially weighted moving average of x.



    Parameters:

    -----------

    x : array-like

    alpha : float {0 <= alpha <= 1}



    Returns:

    --------

    ewma: numpy array

          the exponentially weighted moving average

    '''

    # Coerce x to an array

    x = np.array(x)

    n = x.size



    # Create an initial weight matrix of (1-alpha), and a matrix of powers

    # to raise the weights by

    w0 = np.ones(shape=(n,n)) * (1-alpha)

    p = np.vstack([np.arange(i,i-n,-1) for i in range(n)])



    # Create the weight matrix

    w = np.tril(w0**p,0)



    # Calculate the ewma

    return np.dot(w, x[::np.newaxis]) / w.sum(axis=1)

Let's test its:

alpha = 0.55

x = np.random.randint(0,30,15)

df = pd.DataFrame(x, columns=['A'])

df.ewm(alpha=alpha).mean()



# returns:

#             A

# 0   13.000000

# 1   22.655172

# 2   20.443268

# 3   12.159796

# 4   14.871955

# 5   15.497575

# 6   20.743511

# 7   20.884818

# 8   24.250715

# 9   18.610901

# 10  17.174686

# 11  16.528564

# 12  17.337879

# 13   7.801912

# 14  12.310889



ewma(x=x, alpha=alpha)



# returns:

# array([ 13.        ,  22.65517241,  20.44326778,  12.1597964 ,

#        14.87195534,  15.4975749 ,  20.74351117,  20.88481763,

#        24.25071484,  18.61090129,  17.17468551,  16.52856393,

#        17.33787888,   7.80191235,  12.31088889])

One more thing you can change here is that instead of providing the alpha, you can write alpha=2/(winsowSize+1) at the beginning of the function and then provide the windowSize instead just like in pandas function — Mar 20 '17 at 17:22
is there any other way to calculate p = np.vstack([np.arange(i,i-n,-1) for i in range(n)]) by avoiding the np.vstack looping and the w = np.tril(w0**p,0)? This np.vstack looping and np.tril function seems to be making the entire function perform extremely slow when compared to pandas solution — Mar 20 '17 at 18:05

score 7 · Accepted Answer · 2018-11-27 14:05:30Z

Fastest EWMA 23x `pandas`

The question is strictly asking for a numpy solution, however, it seems that the OP was actually just after a pure numpy solution to speed up runtime.

I solved a similar problem but instead looked towards numba.jit which massively speeds the compute time

In [24]: a = np.random.random(10**7)

    ...: df = pd.Series(a)

In [25]: %timeit numpy_ewma(a, 10)               # /a/42915307/4013571

    ...: %timeit df.ewm(span=10).mean()          # pandas

    ...: %timeit numpy_ewma_vectorized_v2(a, 10) # best w/o numba: /a/42926270/4013571

    ...: %timeit _ewma(a, 10)                    # fastest accurate (below)

    ...: %timeit _ewma_infinite_hist(a, 10)      # fastest overall (below)

4.14 s ± 116 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

991 ms ± 52.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) 

396 ms ± 8.39 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

181 ms ± 1.01 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)   

39.6 ms ± 979 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Scaling down to smaller arrays of a = np.random.random(100) (results in the same order)

41.6 µs ± 491 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

945 ms ± 12 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

16 µs ± 93.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

1.66 µs ± 13.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

1.14 µs ± 5.57 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

It is also worth pointing out that my functions below are identically aligned to the pandas (see the examples in docstr), whereas a few of the answers here take various different approximations. For example,

In [57]: print(pd.DataFrame([1,2,3]).ewm(span=2).mean().values.ravel())

    ...: print(numpy_ewma_vectorized_v2(np.array([1,2,3]), 2))

    ...: print(numpy_ewma(np.array([1,2,3]), 2))

[1.         1.75       2.61538462]

[1.         1.66666667 2.55555556]

[1.         1.18181818 1.51239669]

The source code which I have documented for my own library

import numpy as np

from numba import jit

from numba import float64

from numba import int64





@jit((float64[:], int64), nopython=True, nogil=True)

def _ewma(arr_in, window):

    r"""Exponentialy weighted moving average specified by a decay ``window``

    to provide better adjustments for small windows via:



        y[t] = (x[t] + (1-a)*x[t-1] + (1-a)^2*x[t-2] + ... + (1-a)^n*x[t-n]) /

               (1 + (1-a) + (1-a)^2 + ... + (1-a)^n).



    Parameters

    ----------

    arr_in : np.ndarray, float64

        A single dimenisional numpy array

    window : int64

        The decay window, or 'span'



    Returns

    -------

    np.ndarray

        The EWMA vector, same length / shape as ``arr_in``



    Examples

    --------

    >>> import pandas as pd

    >>> a = np.arange(5, dtype=float)

    >>> exp = pd.DataFrame(a).ewm(span=10, adjust=True).mean()

    >>> np.array_equal(_ewma_infinite_hist(a, 10), exp.values.ravel())

    True

    """

    n = arr_in.shape[0]

    ewma = np.empty(n, dtype=float64)

    alpha = 2 / float(window + 1)

    w = 1

    ewma_old = arr_in[0]

    ewma[0] = ewma_old

    for i in range(1, n):

        w += (1-alpha)**i

        ewma_old = ewma_old*(1-alpha) + arr_in[i]

        ewma[i] = ewma_old / w

    return ewma





@jit((float64[:], int64), nopython=True, nogil=True)

def _ewma_infinite_hist(arr_in, window):

    r"""Exponentialy weighted moving average specified by a decay ``window``

    assuming infinite history via the recursive form:



        (2) (i)  y[0] = x[0]; and

            (ii) y[t] = a*x[t] + (1-a)*y[t-1] for t>0.



    This method is less accurate that ``_ewma`` but

    much faster:



        In [1]: import numpy as np, bars

           ...: arr = np.random.random(100000)

           ...: %timeit bars._ewma(arr, 10)

           ...: %timeit bars._ewma_infinite_hist(arr, 10)

        3.74 ms ± 60.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

        262 µs ± 1.54 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)



    Parameters

    ----------

    arr_in : np.ndarray, float64

        A single dimenisional numpy array

    window : int64

        The decay window, or 'span'



    Returns

    -------

    np.ndarray

        The EWMA vector, same length / shape as ``arr_in``



    Examples

    --------

    >>> import pandas as pd

    >>> a = np.arange(5, dtype=float)

    >>> exp = pd.DataFrame(a).ewm(span=10, adjust=False).mean()

    >>> np.array_equal(_ewma_infinite_hist(a, 10), exp.values.ravel())

    True

    """

    n = arr_in.shape[0]

    ewma = np.empty(n, dtype=float64)

    alpha = 2 / float(window + 1)

    ewma[0] = arr_in[0]

    for i in range(1, n):

        ewma[i] = arr_in[i] * alpha + ewma[i-1] * (1 - alpha)

    return ewma

score 6 · Accepted Answer · 2017-03-20 20:20:48Z

Given alpha and windowSize, here's an approach to simulate the corresponding behavior on NumPy -

def numpy_ewm_alpha(a, alpha, windowSize):

    wghts = (1-alpha)**np.arange(windowSize)

    wghts /= wghts.sum()

    out = np.full(df.shape[0],np.nan)

    out[windowSize-1:] = np.convolve(a,wghts,'valid')

    return out

Sample runs for verification -

In [54]: alpha = 0.55

    ...: windowSize = 20

    ...: 



In [55]: df = pd.DataFrame(np.random.randint(2,9,(100)))



In [56]: out0 = df.ewm(alpha = alpha, min_periods=windowSize).mean().as_matrix().ravel()

    ...: out1 = numpy_ewm_alpha(df.values.ravel(), alpha = alpha, windowSize = windowSize)

    ...: print "Max. error : " + str(np.nanmax(np.abs(out0 - out1)))

    ...: 

Max. error : 5.10531254605e-07



In [57]: alpha = 0.75

    ...: windowSize = 30

    ...: 



In [58]: out0 = df.ewm(alpha = alpha, min_periods=windowSize).mean().as_matrix().ravel()

    ...: out1 = numpy_ewm_alpha(df.values.ravel(), alpha = alpha, windowSize = windowSize)

    ...: print "Max. error : " + str(np.nanmax(np.abs(out0 - out1)))



Max. error : 8.881784197e-16

Runtime test on bigger dataset -

In [61]: alpha = 0.55

    ...: windowSize = 20

    ...: 



In [62]: df = pd.DataFrame(np.random.randint(2,9,(10000)))



In [63]: %timeit df.ewm(alpha = alpha, min_periods=windowSize).mean()

1000 loops, best of 3: 851 µs per loop



In [64]: %timeit numpy_ewm_alpha(df.values.ravel(), alpha = alpha, windowSize = windowSize)

1000 loops, best of 3: 204 µs per loop

Further boost

For further performance boost we could avoid the initialization with NaNs and instead use the array outputted from np.convolve, like so -

def numpy_ewm_alpha_v2(a, alpha, windowSize):

    wghts = (1-alpha)**np.arange(windowSize)

    wghts /= wghts.sum()

    out = np.convolve(a,wghts)

    out[:windowSize-1] = np.nan

    return out[:a.size]

Timings -

In [117]: alpha = 0.55

     ...: windowSize = 20

     ...: 



In [118]: df = pd.DataFrame(np.random.randint(2,9,(10000)))



In [119]: %timeit numpy_ewm_alpha(df.values.ravel(), alpha = alpha, windowSize = windowSize)

1000 loops, best of 3: 204 µs per loop



In [120]: %timeit numpy_ewm_alpha_v2(df.values.ravel(), alpha = alpha, windowSize = windowSize)

10000 loops, best of 3: 195 µs per loop

the speed is indeed incredible, beating pretty much all solutions i have up until now. I can live with the small error though it could generate significant error results down the pipeline. I will look into it a bit more deeply. I have another solution i came up with and will post it as an asnwer. It doesn't beat your answer but the speed is much better than pandas and does not have such a big error — Mar 20 '17 at 22:31
I do not understand completely why is there an error margin? is because of the different computing methods? — Mar 20 '17 at 22:33
@RaduS Are you referring to the Max. error : that I am showing in my tests? — Mar 20 '17 at 22:35
yes. i took and plotted the moving averages from your solution, from pandas and from Exponential Moving Average Formula There is this slight difference between your solution and the other which is even visible on the graph that i do not completely understand why. — Mar 20 '17 at 22:40
@RaduS Hmm how big is that error, as in could you get us the max. abs. differentiation value between my solution and the formula based one? — Mar 20 '17 at 22:45

Peter Mortensen 13.7k1986113 · Accepted Answer · 2018-11-17 15:27:47Z

Here is another solution O came up with in the meantime. It is about four times faster than the pandas solution.

def numpy_ewma(data, window):

    returnArray = np.empty((data.shape[0]))

    returnArray.fill(np.nan)

    e = data[0]

    alpha = 2 / float(window + 1)

    for s in range(data.shape[0]):

        e =  ((data[s]-e) *alpha ) + e

        returnArray[s] = e

    return returnArray

I used this formula as a starting point. I am sure that this can be improved even more, but it is at least a starting point.

DannyDanny 312 · Accepted Answer · 2017-07-26 09:13:53Z

@Divakar's answer seems to cause overflow when dealing with

numpy_ewma_vectorized(np.random.random(500000), 10)

What I have been using is:

def EMA(input, time_period=10): # For time period = 10

    t_ = time_period - 1

    ema = np.zeros_like(input,dtype=float)

    multiplier = 2.0 / (time_period + 1)

    #multiplier = 1 - multiplier

    for i in range(len(input)):

        # Special Case

        if i > t_:

            ema[i] = (input[i] - ema[i-1]) * multiplier + ema[i-1]

        else:

            ema[i] = np.mean(input[:i+1])

    return ema

However, this is way slower than the panda solution:

from pandas import ewma as pd_ema

def EMA_fast(X, time_period = 10):

    out = pd_ema(X, span=time_period, min_periods=time_period)

    out[:time_period-1] = np.cumsum(X[:time_period-1]) / np.asarray(range(1,time_period))

    return out

Peter Mortensen 13.7k1986113 · Accepted Answer · 2018-11-17 15:28:53Z

This answer may seem irrelevant. But, for those who also need to calculate the exponentially weighted variance (and also standard deviation) with NumPy, the following solution will be useful:

import numpy as np



def ew(a, alpha, winSize):

    _alpha = 1 - alpha

    ws = _alpha ** np.arange(winSize)

    w_sum = ws.sum()

    ew_mean = np.convolve(a, ws)[winSize - 1] / w_sum

    bias = (w_sum ** 2) / ((w_sum ** 2) - (ws ** 2).sum())

    ew_var = (np.convolve((a - ew_mean) ** 2, ws)[winSize - 1] / w_sum) * bias

    ew_std = np.sqrt(ew_var)

    return (ew_mean, ew_var, ew_std)

kosnikkosnik 1,155415 · Accepted Answer · 2018-05-10 14:42:53Z

Building on top of Divakar's great answer, here is an implementation which corresponds to the adjust=True flag of the pandas function, i.e. using weights rather than recursion.

def numpy_ewma(data, window):

    alpha = 2 /(window + 1.0)

    scale = 1/(1-alpha)

    n = data.shape[0]

    scale_arr = (1-alpha)**(-1*np.arange(n))

    weights = (1-alpha)**np.arange(n)

    pw0 = (1-alpha)**(n-1)

    mult = data*pw0*scale_arr

    cumsums = mult.cumsum()

    out = cumsums*scale_arr[::-1] / weights.cumsum()



    return out

score 1 · Accepted Answer · 2018-07-30 04:43:24Z

Thanks to @Divakar's solution and that is really fast. However, it does cause overflow problem which was pointed out by @Danny. The function doesn't return correct answers when the length is greater than 13835 or so at my end.

The following is my solution based on Divakar's solution and pandas.ewm().mean()

def numpy_ema(data, com=None, span=None, halflife=None, alpha=None):

"""Summary

Calculate ema with automatically-generated alpha. Weight of past effect

decreases as the length of window increasing.



# these functions reproduce the pandas result when the flag adjust=False is set.

References:

https://stackoverflow.com/questions/42869495/numpy-version-of-exponential-weighted-moving-average-equivalent-to-pandas-ewm



Args:

    data (TYPE): Description

    com (float, optional): Specify decay in terms of center of mass, alpha=1/(1+com), for com>=0

    span (float, optional): Specify decay in terms of span, alpha=2/(span+1), for span>=1

    halflife (float, optional): Specify decay in terms of half-life, alpha=1-exp(log(0.5)/halflife), for halflife>0

    alpha (float, optional): Specify smoothing factor alpha directly, 0<alpha<=1



Returns:

    TYPE: Description



Raises:

    ValueError: Description

"""

n_input = sum(map(bool, [com, span, halflife, alpha]))

if n_input != 1:

    raise ValueError(

        'com, span, halflife, and alpha are mutually exclusive')



nrow = data.shape[0]

if np.isnan(data).any() or (nrow > 13835) or (data.ndim == 2):

    df = pd.DataFrame(data)

    df_ewm = df.ewm(com=com, span=span, halflife=halflife,

                    alpha=alpha, adjust=False)

    out = df_ewm.mean().values.squeeze()

else:

    if com:

        alpha = 1 / (1 + com)

    elif span:

        alpha = 2 / (span + 1.0)

    elif halflife:

        alpha = 1 - np.exp(np.log(0.5) / halflife)



    alpha_rev = 1 - alpha

    pows = alpha_rev**(np.arange(nrow + 1))



    scale_arr = 1 / pows[:-1]

    offset = data[0] * pows[1:]

    pw0 = alpha * alpha_rev**(nrow - 1)



    mult = data * pw0 * scale_arr



    cumsums = np.cumsum(mult)

    out = offset + cumsums * scale_arr[::-1]

return out

score 1 · Accepted Answer · 2019-02-06 14:27:32Z

Here's my implementation for 1D input arrays with infinite window size. As it uses large numbers, it works only with input arrays with elements of absolute value < 1e16, when using float32, but that should normally be the case.

The idea is to reshape the input array into slices of a limited length, so that no overflow occurs, and then doing the ewm calculation with each slice separately.

def ewm(x, alpha):

    """

    Returns the exponentially weighted mean y of a numpy array x with scaling factor alpha

    y[0] = x[0]

    y[j] = (1. - alpha) * y[j-1] + alpha * x[j],  for j > 0



    x -- 1D numpy array

    alpha -- float

    """



    n = int(-100. / np.log(1.-alpha)) # Makes sure that the first and last elements in f are very big and very small (about 1e22 and 1e-22)

    f = np.exp(np.arange(1-n, n, 2) * (0.5 * np.log(1. - alpha))) # Scaling factor for each slice

    tmp = (np.resize(x, ((len(x) + n - 1) // n, n)) / f * alpha).cumsum(axis=1) * f # Get ewm for each slice of length n



    # Add the last value of each previous slice to the next slice with corresponding scaling factor f and return result

    return np.resize(tmp + np.tensordot(np.append(x[0], np.roll(tmp.T[n-1], 1)[1:]), f * ((1. - alpha) / f[0]), axes=0), len(x))

score 3 · Accepted Answer · 2019-02-19 22:42:14Z

Updated 27/11/2018

WORKING PURE NUMPY, FAST & VECTORIZED SOLUTION FOR LARGE INPUTS

out parameter for in-place computation,
dtype parameter,
index order parameter

@Divakar's answer leads to floating point precision problems when the input is too large. This is because (1-alpha)**(n+1) -> 0 when n -> inf and alpha -> 1, leading to divide-by-zero's and NaN values popping up in the calculation.

Here is my fastest solution with no precision problems, nearly fully vectorized. It's gotten a little complicated but the performance is great, especially for really huge inputs. Without using in-place calculations (which is possible using the out parameter, saving memory allocation time): 3.62 seconds for 100M element input vector, 3.2ms for a 100K element input vector, and 293µs for a 5000 element input vector on a pretty old PC (results will vary with different alpha/row_size values).

# tested with python3 & numpy 1.15.2

import numpy as np



def ewma_vectorized_safe(data, alpha, row_size=None, dtype=None, order='C', out=None):

    """

    Reshapes data before calculating EWMA, then iterates once over the rows

    to calculate the offset without precision issues

    :param data: Input data, will be flattened.

    :param alpha: scalar float in range (0,1)

        The alpha parameter for the moving average.

    :param row_size: int, optional

        The row size to use in the computation. High row sizes need higher precision,

        low values will impact performance. The optimal value depends on the

        platform and the alpha being used. Higher alpha values require lower

        row size. Default depends on dtype.

    :param dtype: optional

        Data type used for calculations. Defaults to float64 unless

        data.dtype is float32, then it will use float32.

    :param order: {'C', 'F', 'A'}, optional

        Order to use when flattening the data. Defaults to 'C'.

    :param out: ndarray, or None, optional

        A location into which the result is stored. If provided, it must have

        the same shape as the desired output. If not provided or `None`,

        a freshly-allocated array is returned.

    :return: The flattened result.

    """

    data = np.array(data, copy=False)



    if dtype is None:

        if data.dtype == np.float32:

            dtype = np.float32

        else:

            dtype = np.float

    else:

        dtype = np.dtype(dtype)



    row_size = int(row_size) if row_size is not None 

               else get_max_row_size(alpha, dtype)



    if data.size <= row_size:

        # The normal function can handle this input, use that

        return ewma_vectorized(data, alpha, dtype=dtype, order=order, out=out)



    if data.ndim > 1:

        # flatten input

        data = np.reshape(data, -1, order=order)



    if out is None:

        out = np.empty_like(data, dtype=dtype)

    else:

        assert out.shape == data.shape

        assert out.dtype == dtype



    row_n = int(data.size // row_size)  # the number of rows to use

    trailing_n = int(data.size % row_size)  # the amount of data leftover

    first_offset = data[0]



    if trailing_n > 0:

        # set temporary results to slice view of out parameter

        out_main_view = np.reshape(out[:-trailing_n], (row_n, row_size))

        data_main_view = np.reshape(data[:-trailing_n], (row_n, row_size))

    else:

        out_main_view = out

        data_main_view = data



    # get all the scaled cumulative sums with 0 offset

    ewma_vectorized_2d(data_main_view, alpha, axis=1, offset=0, dtype=dtype,

                       order='C', out=out_main_view)



    scaling_factors = (1 - alpha) ** np.arange(1, row_size + 1)

    last_scaling_factor = scaling_factors[-1]



    # create offset array

    offsets = np.empty(out_main_view.shape[0], dtype=dtype)

    offsets[0] = first_offset

    # iteratively calculate offset for each row

    for i in range(1, out_main_view.shape[0]):

        offsets[i] = offsets[i - 1] * last_scaling_factor + out_main_view[i - 1, -1]



    # add the offsets to the result

    out_main_view += offsets[:, np.newaxis] * scaling_factors[np.newaxis, :]



    if trailing_n > 0:

        # process trailing data in the 2nd slice of the out parameter

        ewma_vectorized(data[-trailing_n:], alpha, offset=out_main_view[-1, -1],

                        dtype=dtype, order='C', out=out[-trailing_n:])

    return out



def get_max_row_size(alpha, dtype=float):

    assert 0. <= alpha < 1.

    # This will return the maximum row size possible on 

    # your platform for the given dtype. I can find no impact on accuracy

    # at this value on my machine.

    # Might not be the optimal value for speed, which is hard to predict

    # due to numpy's optimizations

    # Use np.finfo(dtype).eps if you  are worried about accuracy

    # and want to be extra safe.

    epsilon = np.finfo(dtype).tiny

    # If this produces an OverflowError, make epsilon larger

    return int(np.log(epsilon)/np.log(1-alpha)) + 1

The 1D ewma function:

def ewma_vectorized(data, alpha, offset=None, dtype=None, order='C', out=None):

    """

    Calculates the exponential moving average over a vector.

    Will fail for large inputs.

    :param data: Input data

    :param alpha: scalar float in range (0,1)

        The alpha parameter for the moving average.

    :param offset: optional

        The offset for the moving average, scalar. Defaults to data[0].

    :param dtype: optional

        Data type used for calculations. Defaults to float64 unless

        data.dtype is float32, then it will use float32.

    :param order: {'C', 'F', 'A'}, optional

        Order to use when flattening the data. Defaults to 'C'.

    :param out: ndarray, or None, optional

        A location into which the result is stored. If provided, it must have

        the same shape as the input. If not provided or `None`,

        a freshly-allocated array is returned.

    """

    data = np.array(data, copy=False)



    if dtype is None:

        if data.dtype == np.float32:

            dtype = np.float32

        else:

            dtype = np.float64

    else:

        dtype = np.dtype(dtype)



    if data.ndim > 1:

        # flatten input

        data = data.reshape(-1, order)



    if out is None:

        out = np.empty_like(data, dtype=dtype)

    else:

        assert out.shape == data.shape

        assert out.dtype == dtype



    if data.size < 1:

        # empty input, return empty array

        return out



    if offset is None:

        offset = data[0]



    alpha = np.array(alpha, copy=False).astype(dtype, copy=False)



    # scaling_factors -> 0 as len(data) gets large

    # this leads to divide-by-zeros below

    scaling_factors = np.power(1. - alpha, np.arange(row_size + 1, dtype=dtype),

                               dtype=dtype)

    # create cumulative sum array

    np.multiply(data, (alpha * scaling_factors[-2]) / scaling_factors[:-1],

                dtype=dtype, out=out)

    np.cumsum(out, dtype=dtype, out=out)



    # cumsums / scaling

    out /= scaling_factors[-2::-1]



    if offset != 0:

        offset = np.array(offset, copy=False).astype(dtype, copy=False)

        # add offsets

        out += offset * scaling_factors[1:]



    return out

The 2D ewma function:

def ewma_vectorized_2d(data, alpha, axis=None, offset=None, dtype=None, order='C', out=None):

    """

    Calculates the exponential moving average over a given axis.

    :param data: Input data, must be 1D or 2D array.

    :param alpha: scalar float in range (0,1)

        The alpha parameter for the moving average.

    :param axis: The axis to apply the moving average on.

        If axis==None, the data is flattened.

    :param offset: optional

        The offset for the moving average. Must be scalar or a

        vector with one element for each row of data. If set to None,

        defaults to the first value of each row.

    :param dtype: optional

        Data type used for calculations. Defaults to float64 unless

        data.dtype is float32, then it will use float32.

    :param order: {'C', 'F', 'A'}, optional

        Order to use when flattening the data. Ignored if axis is not None.

    :param out: ndarray, or None, optional

        A location into which the result is stored. If provided, it must have

        the same shape as the desired output. If not provided or `None`,

        a freshly-allocated array is returned.

    """

    data = np.array(data, copy=False)



    assert data.ndim <= 2



    if dtype is None:

        if data.dtype == np.float32:

            dtype = np.float32

        else:

            dtype = np.float64

    else:

        dtype = np.dtype(dtype)



    if out is None:

        out = np.empty_like(data, dtype=dtype)

    else:

        assert out.shape == data.shape

        assert out.dtype == dtype



    if data.size < 1:

        # empty input, return empty array

        return out



    if axis is None or data.ndim < 2:

        # use 1D version

        if isinstance(offset, np.ndarray):

            offset = offset[0]

        return ewma_vectorized(data, alpha, offset, dtype=dtype, order=order,

                               out=out)



    assert -data.ndim <= axis < data.ndim



    # create reshaped data views

    out_view = out

    if axis < 0:

        axis = data.ndim - int(axis)



    if axis == 0:

        # transpose data views so columns are treated as rows

        data = data.T

        out_view = out_view.T



    if offset is None:

        # use the first element of each row as the offset

        offset = np.copy(data[:, 0])

    elif np.size(offset) == 1:

        offset = np.reshape(offset, (1,))



    alpha = np.array(alpha, copy=False).astype(dtype, copy=False)



    # calculate the moving average

    row_size = data.shape[1]

    row_n = data.shape[0]

    scaling_factors = np.power(1. - alpha, np.arange(row_size + 1, dtype=dtype),

                               dtype=dtype)

    # create a scaled cumulative sum array

    np.multiply(

        data,

        np.multiply(alpha * scaling_factors[-2], np.ones((row_n, 1), dtype=dtype),

                    dtype=dtype)

        / scaling_factors[np.newaxis, :-1],

        dtype=dtype, out=out_view

    )

    np.cumsum(out_view, axis=1, dtype=dtype, out=out_view)

    out_view /= scaling_factors[np.newaxis, -2::-1]



    if not (np.size(offset) == 1 and offset == 0):

        offset = offset.astype(dtype, copy=False)

        # add the offsets to the scaled cumulative sums

        out_view += offset[:, np.newaxis] * scaling_factors[np.newaxis, 1:]



    return out

usage:

data_n = 100000000

data = ((0.5*np.random.randn(data_n)+0.5) % 1) * 100

window = 5000

sum_proportion = .875

alpha = 1 - np.exp(np.log(1 - sum_proportion) / window)  # or 2/(window+1) for panda's span function

result = ewma_vectorized_safe(data, alpha)

Just a tip

It is easy to calculate a 'window size' (technically exponential averages have infinite 'windows') for a given alpha, dependent on the contribution of the data in that window to the average. This is useful for example to chose how much of the start of the result to treat as unreliable due to border effects.

sum_proportion = .99  # window covers 99% of contribution to the moving average

scale_factors = (1 - alpha) ** (np.arange(data.shape[0] + 1))

scale_factor_cumsum = np.cumsum(scale_factors)

# window_size is the index of the first partial sum of scale_factors

# where partial_sum > sum_proportion * total_sum.

# Increases with increased sum_proportion and decreased alpha

window_size = np.argmax(scale_factor_cumsum > sum_proportion * scale_factor_cumsum[-1])

or more math-y but efficiently:

def window_size(alpha, sum_proportion):

    # solve (1-alpha)**window_size = (1-sum_proportion) for window_size        

    return int(np.log(1-sum_proportion) / np.log(1-alpha))

The alpha = 2 / (window_size + 1.0) relation used in this thread (the 'span' option from pandas) is a very rough approximation of the inverse of the above function (with sum_proportion~=0.87). alpha = 1 - np.exp(np.log(1-sum_proportion)/window_size) is more accurate (the 'half-life' option from pandas equals this formula with sum_proportion=0.5).

In the following example, data represents a continuous noisy signal. cutoff_idx is the first position in result where at least 99% of the value is dependent on separate values in data (i.e. less than 1% depends on data[0]). The data up to cutoff_idx is excluded from the final results because it is too dependent on the first value in data, therefore possibly skewing the average.

result = ewma_vectorized_safe(data, alpha, chunk_size)

sum_proportion = .99

cutoff_idx = window_size(alpha, sum_proportion)

result = result[cutoff_idx:]

To illustrate the problem the above solve you can run this a few times, notice the often-appearing false start of the red line, which is skipped after cutoff_idx:

data_n = 100000

data = np.random.rand(data_n) * 100

window = 1000

chunk_size = 10000

sum_proportion = .99

alpha = 1 - np.exp(np.log(1-sum_proportion)/window)



result = ewma_vectorized_safe(data, alpha, chunk_size)



cutoff_idx = window_size(alpha, sum_proportion)

x = np.arange(start=0, stop=result.size)



import matplotlib.pyplot as plt

plt.plot(x[:cutoff_idx+1], result[:cutoff_idx+1], '-r',

         x[cutoff_idx:], result[cutoff_idx:], '-b')

plt.show()

note that cutoff_idx==window because alpha was set with the inverse of the window_size() function, with the same sum_proportion.

Peter Mortensen 13.7k1986113 · Accepted Answer · 2018-11-17 15:24:04Z

I think I have finally cracked it!

Here's a vectorized version of numpy_ewma function that's claimed to be producing the correct results from @RaduS's post -

def numpy_ewma_vectorized(data, window):



    alpha = 2 /(window + 1.0)

    alpha_rev = 1-alpha



    scale = 1/alpha_rev

    n = data.shape[0]



    r = np.arange(n)

    scale_arr = scale**r

    offset = data[0]*alpha_rev**(r+1)

    pw0 = alpha*alpha_rev**(n-1)



    mult = data*pw0*scale_arr

    cumsums = mult.cumsum()

    out = offset + cumsums*scale_arr[::-1]

    return out

Further boost

We can boost it further with some code re-use, like so -

def numpy_ewma_vectorized_v2(data, window):



    alpha = 2 /(window + 1.0)

    alpha_rev = 1-alpha

    n = data.shape[0]



    pows = alpha_rev**(np.arange(n+1))



    scale_arr = 1/pows[:-1]

    offset = data[0]*pows[1:]

    pw0 = alpha*alpha_rev**(n-1)



    mult = data*pw0*scale_arr

    cumsums = mult.cumsum()

    out = offset + cumsums*scale_arr[::-1]

    return out

Runtime test

Let's time these two against the same loopy function for a big dataset.

In [97]: data = np.random.randint(2,9,(5000))

    ...: window = 20

    ...:



In [98]: np.allclose(numpy_ewma(data, window), numpy_ewma_vectorized(data, window))

Out[98]: True



In [99]: np.allclose(numpy_ewma(data, window), numpy_ewma_vectorized_v2(data, window))

Out[99]: True



In [100]: %timeit numpy_ewma(data, window)

100 loops, best of 3: 6.03 ms per loop



In [101]: %timeit numpy_ewma_vectorized(data, window)

1000 loops, best of 3: 665 µs per loop



In [102]: %timeit numpy_ewma_vectorized_v2(data, window)

1000 loops, best of 3: 357 µs per loop



In [103]: 6030/357.0

Out[103]: 16.89075630252101

There is around a 17 times speedup!

that's it :) This one is both super fast and correct. Thank you again @Divakar. Is it allowed to change the right answer on stackoverflow? as well as i would like to award this answer with the bounty as it meets my initial requirements of beating pandas performance — Mar 21 '17 at 12:40
@RaduS Yes, changing the accept on answers to some other answer is allowed, if you are getting a better one :) — Mar 21 '17 at 12:42
@RaduS Appreciate the bounty! Was an interesting and quite engrossing problem this one, given the inherent recursion in your loopy version, but was worth the fun! :) — Mar 21 '17 at 12:45
This code doesn't work if the input vector is too large. For example, ewma( np.ones(10000), span=10) gives all nans. This occurs because at some point pows becomes all zeros and scale_arr becomes all nans. — Sep 19 '18 at 22:13

Peter Mortensen 13.7k1986113 · Accepted Answer · 2018-11-17 15:26:03Z

Here is an implementation using NumPy that is equivalent to using df.ewm(alpha=alpha).mean(). After reading the documentation, it is just a few matrix operations. The trick is constructing the right matrices.

It is worth noting that because we are creating float matrices, you can quickly eat through your memory if the input array is too large.

import pandas as pd

import numpy as np



def ewma(x, alpha):

    '''

    Returns the exponentially weighted moving average of x.



    Parameters:

    -----------

    x : array-like

    alpha : float {0 <= alpha <= 1}



    Returns:

    --------

    ewma: numpy array

          the exponentially weighted moving average

    '''

    # Coerce x to an array

    x = np.array(x)

    n = x.size



    # Create an initial weight matrix of (1-alpha), and a matrix of powers

    # to raise the weights by

    w0 = np.ones(shape=(n,n)) * (1-alpha)

    p = np.vstack([np.arange(i,i-n,-1) for i in range(n)])



    # Create the weight matrix

    w = np.tril(w0**p,0)



    # Calculate the ewma

    return np.dot(w, x[::np.newaxis]) / w.sum(axis=1)

Let's test its:

alpha = 0.55

x = np.random.randint(0,30,15)

df = pd.DataFrame(x, columns=['A'])

df.ewm(alpha=alpha).mean()



# returns:

#             A

# 0   13.000000

# 1   22.655172

# 2   20.443268

# 3   12.159796

# 4   14.871955

# 5   15.497575

# 6   20.743511

# 7   20.884818

# 8   24.250715

# 9   18.610901

# 10  17.174686

# 11  16.528564

# 12  17.337879

# 13   7.801912

# 14  12.310889



ewma(x=x, alpha=alpha)



# returns:

# array([ 13.        ,  22.65517241,  20.44326778,  12.1597964 ,

#        14.87195534,  15.4975749 ,  20.74351117,  20.88481763,

#        24.25071484,  18.61090129,  17.17468551,  16.52856393,

#        17.33787888,   7.80191235,  12.31088889])

One more thing you can change here is that instead of providing the alpha, you can write alpha=2/(winsowSize+1) at the beginning of the function and then provide the windowSize instead just like in pandas function — Mar 20 '17 at 17:22
is there any other way to calculate p = np.vstack([np.arange(i,i-n,-1) for i in range(n)]) by avoiding the np.vstack looping and the w = np.tril(w0**p,0)? This np.vstack looping and np.tril function seems to be making the entire function perform extremely slow when compared to pandas solution — Mar 20 '17 at 18:05

score 7 · Accepted Answer · 2018-11-27 14:05:30Z

Fastest EWMA 23x `pandas`

The question is strictly asking for a numpy solution, however, it seems that the OP was actually just after a pure numpy solution to speed up runtime.

I solved a similar problem but instead looked towards numba.jit which massively speeds the compute time

In [24]: a = np.random.random(10**7)

    ...: df = pd.Series(a)

In [25]: %timeit numpy_ewma(a, 10)               # /a/42915307/4013571

    ...: %timeit df.ewm(span=10).mean()          # pandas

    ...: %timeit numpy_ewma_vectorized_v2(a, 10) # best w/o numba: /a/42926270/4013571

    ...: %timeit _ewma(a, 10)                    # fastest accurate (below)

    ...: %timeit _ewma_infinite_hist(a, 10)      # fastest overall (below)

4.14 s ± 116 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

991 ms ± 52.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) 

396 ms ± 8.39 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

181 ms ± 1.01 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)   

39.6 ms ± 979 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Scaling down to smaller arrays of a = np.random.random(100) (results in the same order)

41.6 µs ± 491 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

945 ms ± 12 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

16 µs ± 93.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

1.66 µs ± 13.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

1.14 µs ± 5.57 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

It is also worth pointing out that my functions below are identically aligned to the pandas (see the examples in docstr), whereas a few of the answers here take various different approximations. For example,

In [57]: print(pd.DataFrame([1,2,3]).ewm(span=2).mean().values.ravel())

    ...: print(numpy_ewma_vectorized_v2(np.array([1,2,3]), 2))

    ...: print(numpy_ewma(np.array([1,2,3]), 2))

[1.         1.75       2.61538462]

[1.         1.66666667 2.55555556]

[1.         1.18181818 1.51239669]

The source code which I have documented for my own library

import numpy as np

from numba import jit

from numba import float64

from numba import int64





@jit((float64[:], int64), nopython=True, nogil=True)

def _ewma(arr_in, window):

    r"""Exponentialy weighted moving average specified by a decay ``window``

    to provide better adjustments for small windows via:



        y[t] = (x[t] + (1-a)*x[t-1] + (1-a)^2*x[t-2] + ... + (1-a)^n*x[t-n]) /

               (1 + (1-a) + (1-a)^2 + ... + (1-a)^n).



    Parameters

    ----------

    arr_in : np.ndarray, float64

        A single dimenisional numpy array

    window : int64

        The decay window, or 'span'



    Returns

    -------

    np.ndarray

        The EWMA vector, same length / shape as ``arr_in``



    Examples

    --------

    >>> import pandas as pd

    >>> a = np.arange(5, dtype=float)

    >>> exp = pd.DataFrame(a).ewm(span=10, adjust=True).mean()

    >>> np.array_equal(_ewma_infinite_hist(a, 10), exp.values.ravel())

    True

    """

    n = arr_in.shape[0]

    ewma = np.empty(n, dtype=float64)

    alpha = 2 / float(window + 1)

    w = 1

    ewma_old = arr_in[0]

    ewma[0] = ewma_old

    for i in range(1, n):

        w += (1-alpha)**i

        ewma_old = ewma_old*(1-alpha) + arr_in[i]

        ewma[i] = ewma_old / w

    return ewma





@jit((float64[:], int64), nopython=True, nogil=True)

def _ewma_infinite_hist(arr_in, window):

    r"""Exponentialy weighted moving average specified by a decay ``window``

    assuming infinite history via the recursive form:



        (2) (i)  y[0] = x[0]; and

            (ii) y[t] = a*x[t] + (1-a)*y[t-1] for t>0.



    This method is less accurate that ``_ewma`` but

    much faster:



        In [1]: import numpy as np, bars

           ...: arr = np.random.random(100000)

           ...: %timeit bars._ewma(arr, 10)

           ...: %timeit bars._ewma_infinite_hist(arr, 10)

        3.74 ms ± 60.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

        262 µs ± 1.54 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)



    Parameters

    ----------

    arr_in : np.ndarray, float64

        A single dimenisional numpy array

    window : int64

        The decay window, or 'span'



    Returns

    -------

    np.ndarray

        The EWMA vector, same length / shape as ``arr_in``



    Examples

    --------

    >>> import pandas as pd

    >>> a = np.arange(5, dtype=float)

    >>> exp = pd.DataFrame(a).ewm(span=10, adjust=False).mean()

    >>> np.array_equal(_ewma_infinite_hist(a, 10), exp.values.ravel())

    True

    """

    n = arr_in.shape[0]

    ewma = np.empty(n, dtype=float64)

    alpha = 2 / float(window + 1)

    ewma[0] = arr_in[0]

    for i in range(1, n):

        ewma[i] = arr_in[i] * alpha + ewma[i-1] * (1 - alpha)

    return ewma

score 6 · Accepted Answer · 2017-03-20 20:20:48Z

Given alpha and windowSize, here's an approach to simulate the corresponding behavior on NumPy -

def numpy_ewm_alpha(a, alpha, windowSize):

    wghts = (1-alpha)**np.arange(windowSize)

    wghts /= wghts.sum()

    out = np.full(df.shape[0],np.nan)

    out[windowSize-1:] = np.convolve(a,wghts,'valid')

    return out

Sample runs for verification -

In [54]: alpha = 0.55

    ...: windowSize = 20

    ...: 



In [55]: df = pd.DataFrame(np.random.randint(2,9,(100)))



In [56]: out0 = df.ewm(alpha = alpha, min_periods=windowSize).mean().as_matrix().ravel()

    ...: out1 = numpy_ewm_alpha(df.values.ravel(), alpha = alpha, windowSize = windowSize)

    ...: print "Max. error : " + str(np.nanmax(np.abs(out0 - out1)))

    ...: 

Max. error : 5.10531254605e-07



In [57]: alpha = 0.75

    ...: windowSize = 30

    ...: 



In [58]: out0 = df.ewm(alpha = alpha, min_periods=windowSize).mean().as_matrix().ravel()

    ...: out1 = numpy_ewm_alpha(df.values.ravel(), alpha = alpha, windowSize = windowSize)

    ...: print "Max. error : " + str(np.nanmax(np.abs(out0 - out1)))



Max. error : 8.881784197e-16

Runtime test on bigger dataset -

In [61]: alpha = 0.55

    ...: windowSize = 20

    ...: 



In [62]: df = pd.DataFrame(np.random.randint(2,9,(10000)))



In [63]: %timeit df.ewm(alpha = alpha, min_periods=windowSize).mean()

1000 loops, best of 3: 851 µs per loop



In [64]: %timeit numpy_ewm_alpha(df.values.ravel(), alpha = alpha, windowSize = windowSize)

1000 loops, best of 3: 204 µs per loop

Further boost

For further performance boost we could avoid the initialization with NaNs and instead use the array outputted from np.convolve, like so -

def numpy_ewm_alpha_v2(a, alpha, windowSize):

    wghts = (1-alpha)**np.arange(windowSize)

    wghts /= wghts.sum()

    out = np.convolve(a,wghts)

    out[:windowSize-1] = np.nan

    return out[:a.size]

Timings -

In [117]: alpha = 0.55

     ...: windowSize = 20

     ...: 



In [118]: df = pd.DataFrame(np.random.randint(2,9,(10000)))



In [119]: %timeit numpy_ewm_alpha(df.values.ravel(), alpha = alpha, windowSize = windowSize)

1000 loops, best of 3: 204 µs per loop



In [120]: %timeit numpy_ewm_alpha_v2(df.values.ravel(), alpha = alpha, windowSize = windowSize)

10000 loops, best of 3: 195 µs per loop

the speed is indeed incredible, beating pretty much all solutions i have up until now. I can live with the small error though it could generate significant error results down the pipeline. I will look into it a bit more deeply. I have another solution i came up with and will post it as an asnwer. It doesn't beat your answer but the speed is much better than pandas and does not have such a big error — Mar 20 '17 at 22:31
I do not understand completely why is there an error margin? is because of the different computing methods? — Mar 20 '17 at 22:33
@RaduS Are you referring to the Max. error : that I am showing in my tests? — Mar 20 '17 at 22:35
yes. i took and plotted the moving averages from your solution, from pandas and from Exponential Moving Average Formula There is this slight difference between your solution and the other which is even visible on the graph that i do not completely understand why. — Mar 20 '17 at 22:40
@RaduS Hmm how big is that error, as in could you get us the max. abs. differentiation value between my solution and the formula based one? — Mar 20 '17 at 22:45

Peter Mortensen 13.7k1986113 · Accepted Answer · 2018-11-17 15:27:47Z

Here is another solution O came up with in the meantime. It is about four times faster than the pandas solution.

def numpy_ewma(data, window):

    returnArray = np.empty((data.shape[0]))

    returnArray.fill(np.nan)

    e = data[0]

    alpha = 2 / float(window + 1)

    for s in range(data.shape[0]):

        e =  ((data[s]-e) *alpha ) + e

        returnArray[s] = e

    return returnArray

I used this formula as a starting point. I am sure that this can be improved even more, but it is at least a starting point.

DannyDanny 312 · Accepted Answer · 2017-07-26 09:13:53Z

@Divakar's answer seems to cause overflow when dealing with

numpy_ewma_vectorized(np.random.random(500000), 10)

What I have been using is:

def EMA(input, time_period=10): # For time period = 10

    t_ = time_period - 1

    ema = np.zeros_like(input,dtype=float)

    multiplier = 2.0 / (time_period + 1)

    #multiplier = 1 - multiplier

    for i in range(len(input)):

        # Special Case

        if i > t_:

            ema[i] = (input[i] - ema[i-1]) * multiplier + ema[i-1]

        else:

            ema[i] = np.mean(input[:i+1])

    return ema

However, this is way slower than the panda solution:

from pandas import ewma as pd_ema

def EMA_fast(X, time_period = 10):

    out = pd_ema(X, span=time_period, min_periods=time_period)

    out[:time_period-1] = np.cumsum(X[:time_period-1]) / np.asarray(range(1,time_period))

    return out

Peter Mortensen 13.7k1986113 · Accepted Answer · 2018-11-17 15:28:53Z

This answer may seem irrelevant. But, for those who also need to calculate the exponentially weighted variance (and also standard deviation) with NumPy, the following solution will be useful:

import numpy as np



def ew(a, alpha, winSize):

    _alpha = 1 - alpha

    ws = _alpha ** np.arange(winSize)

    w_sum = ws.sum()

    ew_mean = np.convolve(a, ws)[winSize - 1] / w_sum

    bias = (w_sum ** 2) / ((w_sum ** 2) - (ws ** 2).sum())

    ew_var = (np.convolve((a - ew_mean) ** 2, ws)[winSize - 1] / w_sum) * bias

    ew_std = np.sqrt(ew_var)

    return (ew_mean, ew_var, ew_std)

kosnikkosnik 1,155415 · Accepted Answer · 2018-05-10 14:42:53Z

Building on top of Divakar's great answer, here is an implementation which corresponds to the adjust=True flag of the pandas function, i.e. using weights rather than recursion.

def numpy_ewma(data, window):

    alpha = 2 /(window + 1.0)

    scale = 1/(1-alpha)

    n = data.shape[0]

    scale_arr = (1-alpha)**(-1*np.arange(n))

    weights = (1-alpha)**np.arange(n)

    pw0 = (1-alpha)**(n-1)

    mult = data*pw0*scale_arr

    cumsums = mult.cumsum()

    out = cumsums*scale_arr[::-1] / weights.cumsum()



    return out

score 1 · Accepted Answer · 2018-07-30 04:43:24Z

Thanks to @Divakar's solution and that is really fast. However, it does cause overflow problem which was pointed out by @Danny. The function doesn't return correct answers when the length is greater than 13835 or so at my end.

The following is my solution based on Divakar's solution and pandas.ewm().mean()

def numpy_ema(data, com=None, span=None, halflife=None, alpha=None):

"""Summary

Calculate ema with automatically-generated alpha. Weight of past effect

decreases as the length of window increasing.



# these functions reproduce the pandas result when the flag adjust=False is set.

References:

https://stackoverflow.com/questions/42869495/numpy-version-of-exponential-weighted-moving-average-equivalent-to-pandas-ewm



Args:

    data (TYPE): Description

    com (float, optional): Specify decay in terms of center of mass, alpha=1/(1+com), for com>=0

    span (float, optional): Specify decay in terms of span, alpha=2/(span+1), for span>=1

    halflife (float, optional): Specify decay in terms of half-life, alpha=1-exp(log(0.5)/halflife), for halflife>0

    alpha (float, optional): Specify smoothing factor alpha directly, 0<alpha<=1



Returns:

    TYPE: Description



Raises:

    ValueError: Description

"""

n_input = sum(map(bool, [com, span, halflife, alpha]))

if n_input != 1:

    raise ValueError(

        'com, span, halflife, and alpha are mutually exclusive')



nrow = data.shape[0]

if np.isnan(data).any() or (nrow > 13835) or (data.ndim == 2):

    df = pd.DataFrame(data)

    df_ewm = df.ewm(com=com, span=span, halflife=halflife,

                    alpha=alpha, adjust=False)

    out = df_ewm.mean().values.squeeze()

else:

    if com:

        alpha = 1 / (1 + com)

    elif span:

        alpha = 2 / (span + 1.0)

    elif halflife:

        alpha = 1 - np.exp(np.log(0.5) / halflife)



    alpha_rev = 1 - alpha

    pows = alpha_rev**(np.arange(nrow + 1))



    scale_arr = 1 / pows[:-1]

    offset = data[0] * pows[1:]

    pw0 = alpha * alpha_rev**(nrow - 1)



    mult = data * pw0 * scale_arr



    cumsums = np.cumsum(mult)

    out = offset + cumsums * scale_arr[::-1]

return out

score 1 · Accepted Answer · 2019-02-06 14:27:32Z

Here's my implementation for 1D input arrays with infinite window size. As it uses large numbers, it works only with input arrays with elements of absolute value < 1e16, when using float32, but that should normally be the case.

The idea is to reshape the input array into slices of a limited length, so that no overflow occurs, and then doing the ewm calculation with each slice separately.

def ewm(x, alpha):

    """

    Returns the exponentially weighted mean y of a numpy array x with scaling factor alpha

    y[0] = x[0]

    y[j] = (1. - alpha) * y[j-1] + alpha * x[j],  for j > 0



    x -- 1D numpy array

    alpha -- float

    """



    n = int(-100. / np.log(1.-alpha)) # Makes sure that the first and last elements in f are very big and very small (about 1e22 and 1e-22)

    f = np.exp(np.arange(1-n, n, 2) * (0.5 * np.log(1. - alpha))) # Scaling factor for each slice

    tmp = (np.resize(x, ((len(x) + n - 1) // n, n)) / f * alpha).cumsum(axis=1) * f # Get ewm for each slice of length n



    # Add the last value of each previous slice to the next slice with corresponding scaling factor f and return result

    return np.resize(tmp + np.tensordot(np.append(x[0], np.roll(tmp.T[n-1], 1)[1:]), f * ((1. - alpha) / f[0]), axes=0), len(x))

搜尋此網誌

Ndtyjky

NumPy version of “Exponential weighted moving average”, equivalent to pandas.ewm().mean()

11 Answers
11

Fastest EWMA 23x `pandas`

Your Answer

Post as a guest

11 Answers
11

11 Answers
11

Fastest EWMA 23x `pandas`

Fastest EWMA 23x `pandas`

Fastest EWMA 23x `pandas`

Fastest EWMA 23x `pandas`

Post as a guest

Popular posts from this blog

The Sandy Post

Danny Elfman

Pages that link to "Head v. Amoskeag Manufacturing Co."

NumPy version of “Exponential weighted moving average”, equivalent to pandas.ewm().mean()

11 Answers 11

Fastest EWMA 23x pandas

Your Answer

Sign up or log in

Post as a guest

Post as a guest

11 Answers 11

11 Answers 11

Fastest EWMA 23x pandas

Fastest EWMA 23x pandas

Fastest EWMA 23x pandas

Fastest EWMA 23x pandas

Sign up or log in

Post as a guest

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Popular posts from this blog

The Sandy Post

Danny Elfman

Pages that link to "Head v. Amoskeag Manufacturing Co."

11 Answers
11

Fastest EWMA 23x `pandas`

11 Answers
11

11 Answers
11

Fastest EWMA 23x `pandas`

Fastest EWMA 23x `pandas`

Fastest EWMA 23x `pandas`

Fastest EWMA 23x `pandas`