NumPy giảm trên lát không tiếp giáp liên tiếp

Nói rằng tôi có hai mảng NumPy, A hình dạng (d, f) và I hình dạng (d,) chỉ số chứa trong 0..n, ví dụNumPy giảm trên lát không tiếp giáp liên tiếp

I = np.array([0, 0, 1, 0, 2, 1]) 
A = np.arange(12).reshape(6, 2)

Tôi đang tìm kiếm một cách nhanh chóng để làm giảm, đặc biệt sum, mean và max, trên tất cả các lát A[I == i, :]; một phiên bản chậm sẽ

results = np.zeros((I.max() + 1, A.shape[1])) 
for i in np.unique(I): 
    results[i, :] = np.mean(A[I == i, :], axis=0)

mang đến cho trong trường hợp này

results = [[ 2.66666667, 3.66666667], 
      [ 7.  , 8.  ], 
      [ 8.  , 9.  ]])

EDIT: Tôi đã làm một số timings dựa trên câu trả lời Divakar và (xóa) pandas câu trả lời dựa trên một poster trước .

đang Thời gian:

from __future__ import division, print_function 
import numpy as np, pandas as pd 
from time import time 

np.random.seed(0) 
d = 500000 
f = 500 
n = 500 
I = np.hstack((np.arange(n), np.random.randint(n, size=(d - n,)))) 
np.random.shuffle(I) 
A = np.random.rand(d, f) 

def reduce_naive(A, I, op="avg"): 
    target_dtype = (np.float if op=="avg" else A.dtype) 
    results = np.zeros((I.max() + 1, A.shape[1]), dtype=target_dtype) 
    npop = {"avg": np.mean, "sum": np.sum, "max": np.max}.get(op) 
    for i in np.unique(I): 
     results[i, :] = npop(A[I == i, :], axis=0) 
    return results 

def reduce_reduceat(A, I, op="avg"): 
    sidx = I.argsort() 
    sI = I[sidx] 
    sortedA = A[sidx] 
    idx = np.r_[ 0, np.flatnonzero(sI[1:] != sI[:-1])+1 ] 
    if op == "max": 
     return np.maximum.reduceat(sortedA, idx, axis=0) 
    sums = np.add.reduceat(sortedA, idx, axis=0) 
    if op == "sum": 
     return sums 
    if op == "avg": 
     count = np.r_[idx[1:] - idx[:-1], A.shape[0] - idx[-1]] 
     return sums/count.astype(float)[:,None] 

def reduce_bincount(A, I, op="avg"): 
    ids = (I[:,None] + (I.max()+1)*np.arange(A.shape[1])).ravel() 
    sums = np.bincount(ids, A.ravel()).reshape(A.shape[1],-1).T 
    if op == "sum": 
     return sums 
    if op == "avg": 
     return sums/np.bincount(ids).reshape(A.shape[1],-1).T 

def reduce_pandas(A, I, op="avg"): 
    group = pd.concat([pd.DataFrame(A), pd.DataFrame(I, columns=("i",)) 
        ], axis=1 
        ).groupby('i') 
    if op == "sum": 
     return group.sum().values 
    if op == "avg": 
     return group.mean().values 
    if op == "max": 
     return group.max().values 

def reduce_hybrid(A, I, op="avg"): 
    sidx = I.argsort() 
    sI = I[sidx] 
    sortedA = A[sidx] 

    idx = np.r_[ 0, np.flatnonzero(sI[1:] != sI[:-1])+1 ] 
    unq_sI = sI[idx]  

    m = I.max()+1 
    N = A.shape[1] 

    target_dtype = (np.float if op=="avg" else A.dtype) 
    out = np.zeros((m,N),dtype=target_dtype) 
    ss_idx = np.r_[idx,A.shape[0]] 

    npop = {"avg": np.mean, "sum": np.sum, "max": np.max}.get(op) 
    for i in range(len(idx)): 
     out[unq_sI[i]] = npop(sortedA[ss_idx[i]:ss_idx[i+1]], axis=0) 
    return out 

for op in ("sum", "avg", "max"): 
    for name, method in (("naive ", reduce_naive), 
         ("reduceat", reduce_reduceat), 
         ("pandas ", reduce_pandas), 
         ("bincount", reduce_bincount), 
         ("hybrid ", reduce_hybrid) 
         ("numba ", reduce_numba) 
         ):  
     if op == "max" and name == "bincount": 
      continue 
     # if name is not "naive": 
     #  assert np.allclose(method(A, I, op), reduce_naive(A, I, op)) 
     times = [] 
     for tries in range(3): 
      time0 = time(); method(A, I, op) 
      times.append(time() - time0); 
     print(name, op, "{:.2f}".format(np.min(times))) 
    print()

Thời gian:

naive sum 1.10 
reduceat sum 4.62 
pandas sum 5.29 
bincount sum 1.54 
hybrid sum 0.62 
numba sum 0.31 

naive avg 1.12 
reduceat avg 4.45 
pandas avg 5.23 
bincount avg 2.43 
hybrid avg 0.61 
numba avg 0.33 

naive max 1.19 
reduceat max 3.18 
pandas max 5.24 
hybrid max 0.72 
numba max 0.34

(tôi đã chọn d và n giá trị như điển hình cho trường hợp sử dụng của tôi - Tôi đã thêm mã cho numba-phiên bản trong câu trả lời của tôi).

Nguồn

2017-03-17 Maxim

Thời gian 'pandas' của bạn thấp (so với các thử nghiệm của tôi), nhưng đó là vì bạn đã thực hiện việc tạo khung dữ liệu và' nhóm ra khỏi vòng lặp thời gian. – hpaulj

@hpaulj lưu ý rằng 'groupby' nằm bên trong hàm, vì vậy bên trong vòng lặp thời gian. Tôi đã cập nhật thời gian, cố định hạt giống và lấy tốt nhất 3 để tái tạo tốt hơn. Tôi có gấu trúc '0.19.2'. – Maxim

Tôi chấp nhận giải pháp của Divakar như một cách tiếp cận thuần túy, nhưng điều này cho thấy mã được biên dịch bằng numba hoặc cython nên đưa ra các giải pháp nhanh nhất ở đây. – Maxim

Approach # 1: Sử dụng NumPy ufunc reduceat

Chúng tôi có ufuncs cho những ba hoạt động giảm và may mắn là chúng tôi cũng có ufunc.reduceat để thực hiện những cắt giảm giới hạn trong khoảng thời gian đặc biệt cùng một trục. Vì vậy, sử dụng đó, chúng tôi sẽ có những ba hoạt động tính toán như vậy -

# Gives us sorted array based on input indices I and indices at which the 
# sorted array should be interval-limited for reduceat operations to be 
# applied later on using those results 
def sorted_array_intervals(A, I): 
    # Compute sort indices for I. To be later used for sorting A based on it. 
    sidx = I.argsort() 
    sI = I[sidx] 
    sortedA = A[sidx] 

    # Get indices at which intervals change. Also, get count in each interval 
    idx = np.r_[ 0, np.flatnonzero(sI[1:] != sI[:-1])+1 ] 
    return sortedA, idx 

# Groupby sum reduction using the interval indices 
# to perform interval-limited ufunc reductions 
def groupby_sum(A, I): 
    sortedA, idx = sorted_array_intervals(A,I) 
    return np.add.reduceat(sortedA, idx, axis=0) 

# Groupby mean reduction 
def groupby_mean(A, I): 
    sortedA, idx = sorted_array_intervals(A,I) 
    sums = np.add.reduceat(sortedA, idx, axis=0) 
    count = np.r_[idx[1:] - idx[:-1], A.shape[0] - idx[-1]] 
    return sums/count.astype(float)[:,None] 

# Groupby max reduction 
def groupby_max(A, I): 
    sortedA, idx = sorted_array_intervals(A,I) 
    return np.maximum.reduceat(sortedA, idx, axis=0)

Vì vậy, nếu chúng ta cần tất cả những hoạt động, chúng ta có thể tái sử dụng với một ví dụ của sorted_array_intervals, như vậy -

def groupby_sum_mean_max(A, I): 
    sortedA, idx = sorted_array_intervals(A,I) 
    sums = np.add.reduceat(sortedA, idx, axis=0) 
    count = np.r_[idx[1:] - idx[:-1], A.shape[0] - idx[-1]] 
    avgs = sums/count.astype(float)[:,None] 
    maxs = np.maximum.reduceat(sortedA, idx, axis=0) 
    return sums, avgs, maxs

Phương pháp tiếp cận # 1-B: Phiên bản hỗn hợp (sắp xếp + cắt + giảm)

Đây là phiên bản hỗn hợp giúp lấy mẫu được sắp xếp và các chỉ số thay đổi thành t nhóm tiếp theo, nhưng ở giai đoạn cuối sử dụng slicing để tính tổng mỗi khoảng thời gian và thực hiện điều này lặp lại cho từng nhóm. Việc cắt giúp ở đây khi chúng tôi đang làm việc với views.

Việc thực hiện sẽ giống như thế này -

def reduce_hybrid(A, I, op="avg"): 
    sidx = I.argsort() 
    sI = I[sidx] 
    sortedA = A[sidx] 

    # Get indices at which intervals change. Also, get count in each interval 
    idx = np.r_[ 0, np.flatnonzero(sI[1:] != sI[:-1])+1 ] 
    unq_sI = sI[idx]  

    m = I.max()+1 
    N = A.shape[1] 

    target_dtype = (np.float if op=="avg" else A.dtype) 
    out = np.zeros((m,N),dtype=target_dtype) 
    ss_idx = np.r_[idx,A.shape[0]] 

    npop = {"avg": np.mean, "sum": np.sum, "max": np.max}.get(op) 
    for i in range(len(idx)): 
     out[unq_sI[i]] = npop(sortedA[ss_idx[i]:ss_idx[i+1]], axis=0) 
    return out

Runtime thử nghiệm (Sử dụng cài đặt từ các tiêu chuẩn đăng trong câu hỏi) -

In [432]: d = 500000 
    ...: f = 500 
    ...: n = 500 
    ...: I = np.hstack((np.arange(n), np.random.randint(n, size=(d - n,)))) 
    ...: np.random.shuffle(I) 
    ...: A = np.random.rand(d, f) 
    ...: 

In [433]: %timeit reduce_naive(A, I, op="sum") 
    ...: %timeit reduce_hybrid(A, I, op="sum") 
    ...: 
1 loops, best of 3: 1.03 s per loop 
1 loops, best of 3: 549 ms per loop 

In [434]: %timeit reduce_naive(A, I, op="avg") 
    ...: %timeit reduce_hybrid(A, I, op="avg") 
    ...: 
1 loops, best of 3: 1.04 s per loop 
1 loops, best of 3: 550 ms per loop 

In [435]: %timeit reduce_naive(A, I, op="max") 
    ...: %timeit reduce_hybrid(A, I, op="max") 
    ...: 
1 loops, best of 3: 1.14 s per loop 
1 loops, best of 3: 631 ms per loop

Cách tiếp cận # 2: Sử dụng NumPy bincount

Đây là một cách tiếp cận khác bằng cách sử dụng np.bincount tính tổng hợp dựa trên bin. Như vậy, với nó, chúng ta có thể tính toán số tiền và giá trị trung bình và cũng tránh sắp xếp trong quá trình này, như vậy -

ids = (I[:,None] + (I.max()+1)*np.arange(A.shape[1])).ravel() 
sums = np.bincount(ids, A.ravel()).reshape(A.shape[1],-1).T 
avgs = sums/np.bincount(ids).reshape(A.shape[1],-1).T

Nguồn

2017-03-17 17:08:44 Divakar

Công việc tuyệt vời, tôi đã nhận thấy giảm giá nhưng đã loại bỏ nó bởi vì nó chỉ hoạt động trên khoảng cách liền kề – Maxim

Bạn cũng có thể sử dụng, '_, idx, count = np.unique (sI, return_counts = True, return_index = True)'. Mặc dù nó có một loại, nó thực sự nhanh hơn một chút so với các biểu thức 'r_' của bạn. – hpaulj

Dựa trên thời gian rõ ràng không thuận lợi cho các phương pháp sắp xếp dựa trên tôi đã chọn mở lại câu hỏi ngay bây giờ. – Maxim

Sử dụng python/NumPy JIT biên dịch Numba tôi đã có thể để có được timings ngắn với just-in biên soạn thời gian của thuật toán tuyến tính trực quan:

from numba import jit 

@jit 
def reducenb_avg(A, I): 
    d, f = A.shape 
    n = I.max() + 1 
    result = np.zeros((n, f), np.float) 
    count = np.zeros((n, 1), int) 
    for i in range(d): 
     result[I[i], :] += A[i] 
     count[I[i], 0] += 1 
    return result/count 

@jit 
def reducenb_sum(A, I): 
    d, f = A.shape 
    n = I.max() + 1 
    result = np.zeros((n, f), A.dtype) 
    for i in range(d): 
     result[I[i], :] += A[i] 
    return result 

@jit 
def reducenb_max(A, I): 
    d, f = A.shape 
    n = I.max() + 1 
    result = -np.inf * np.ones((n, f)) 
    count = np.zeros((n, f)) 
    for i in range(d): 
     result[I[i], :] = np.maximum(A[i], result[I[i], :]) 
    return result 

def reduce_numba(A, I, op="avg"): 
    return {"sum": reducenb_sum, "avg": reducenb_avg, "max": reducenb_max}.get(op)(A, I)

Trên vấn đề điểm chuẩn, kết thúc bằng ~ 0,32 giây, khoảng một nửa thời gian của các phương pháp phân loại thô tục thuần túy.

Nguồn

2017-03-17 23:22:55 Maxim

Một công cụ có thể được sử dụng cho điều này là không có bộ đệm add.at:

def add_at(I,A): 
    n = I.max() + 1 
    res = np.zeros((n,A.shape[1])) 
    cnt = np.zeros((n,1)) 
    np.add.at(res, I, A) 
    np.add.at(cnt, I, 1) 
    return res/cnt

(Đó là khá chặt chẽ trong cấu trúc numbareducenb_avg)

In [438]: add_at(I,A) 
Out[438]: 
array([[ 2.66666667, 3.66666667], 
     [ 7.  , 8.  ], 
     [ 8.  , 9.  ]])

Đối với vấn đề nhỏ này nó kiểm tra tốt, so cho người khác, nhưng nó không mở rộng tốt (tăng từ 3x nhanh hơn đến 12x).

Nguồn

2017-03-18 01:26:08 hpaulj

NumPy giảm trên lát không tiếp giáp liên tiếp

Trả lời

Các vấn đề liên quan