Aspiring Data Science
383 subscribers
465 photos
12 videos
12 files
2.16K links
Заметки экономиста о программировании, прогнозировании и принятии решений, научном методе познания.
Контакт: @fingoldo

I call myself a data scientist because I know just enough math, economics & programming to be dangerous.
Download Telegram
#toboml #stats #ecdf

The main thing we are looking out for are plateaux on the eCDF.

import numpy as np
x_eCDF = np.sort(data)
y_eCDF = np.arange(1, len(data)+1 ) / len(data)

and to plot both the histogram and the eCDF together

import matplotlib.pyplot as plt
fig, ax1 = plt.subplots(figsize=(15, 7))
twin = ax1.twinx()
ax1.hist(data, bins=12, density=False)
twin.plot(x_eCDF, y_eCDF, linewidth=5, color="red")
twin.set_ylim(bottom=0, top=None)
plt.show();
#toboml #stats #ecdf #kstest

Kolmogorov-Smirnov test is based on finding the largest vertical distance between the two eCDF.