How do I plot and display the distribution of a dataset in Python?

I have a dataset; a small part of it is shown below:

data = [ ['2018-01-01',1.323,'AI',2000,'Communications','Mothers'],['2018-01-02',1.525,1500,['2018-01-03',1.045,500,['2018-01-04',1.845,600,['2018-01-05',1.446,'BOC',550,'Pharmaceuticals','JASDAQ Standard'],2.110,3201,2.150,5200,2.810,1980,5.199,'CAT','Real Estate',['2018-01-06',4.980,450,['2018-01-07',4.990,3000,'Mothers']]
import pandas as pd

df = pd.DataFrame(data, columns=['date', 'price', 'ticker', 'volume', 'Sector', 'Market Division'])

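I want to show which market division has more stocks, and which sectors those stocks come from. I tried the treemap below, but it did not work. What should I do?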

import plotly.express as px
import numpy as np

a=df.groupby(['Market Division','Sector']).count()

a["Exchange"] = "Exchange" # in order to have a single root node
fig = px.treemap(a,path=['Exchange','Market Division','ticker'],values='ticker')
fig.show()

Solution

You can try a stacked bar plot. Here is a dummy example:

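import matplotlib.pyplot as plt
import numpy as np

# One bar per market division; the label order is fixed so it matches the counts below
labels = ['JASDAQ Standard', 'Mothers']
sectors = ['Pharmaceuticals', 'Communications', 'Real Estate']
# Dummy counts per sector, in the same order as `sectors`
jasdaq = [3434, 5454, 45454]
mothers = [35345, 64534, 43543]

fig, ax = plt.subplots()
bottom = np.zeros(len(labels))
for i, sector in enumerate(sectors):
    heights = np.array([jasdaq[i], mothers[i]])
    # Draw this sector on top of the sectors already plotted
    ax.bar(labels, heights, bottom=bottom, label=sector)
    bottom += heights

ax.legend()
plt.show()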

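You first need to count the stocks in each sector for each market division, then replace the dummy jasdaq and mothers values with those counts to get the real plot you want.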
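The counting step could be sketched with pandas as follows. This is a minimal sketch, not the answerer's own code: the `sample` frame is made-up illustration data (the ticker `DOG` is hypothetical), the helper names `sector_counts` and `plot_stacked` are invented here, and it assumes that "more stocks" means the number of distinct tickers rather than the number of rows.

```python
import pandas as pd

# Made-up rows for illustration only; they are NOT the asker's data.
# The column names follow the question's DataFrame.
sample = pd.DataFrame(
    {
        "date": ["2018-01-01", "2018-01-02", "2018-01-01", "2018-01-01", "2018-01-01"],
        "ticker": ["AI", "AI", "BOC", "CAT", "DOG"],
        "Sector": ["Communications", "Communications", "Pharmaceuticals",
                   "Real Estate", "Real Estate"],
        "Market Division": ["Mothers", "Mothers", "JASDAQ Standard",
                            "Mothers", "JASDAQ Standard"],
    }
)


def sector_counts(df):
    """Count distinct tickers for each (Market Division, Sector) pair.

    Returns a table with one row per market division and one column per
    sector; pairs that never occur are filled with 0.
    """
    return (
        df.groupby(["Market Division", "Sector"])["ticker"]
        .nunique()
        .unstack(fill_value=0)
    )


def plot_stacked(counts):
    """Draw one stacked bar per market division (requires matplotlib)."""
    ax = counts.plot(kind="bar", stacked=True)
    ax.set_ylabel("Number of tickers")
    return ax


counts = sector_counts(sample)
print(counts)
```

To see the chart, call `plot_stacked(counts)` and then `matplotlib.pyplot.show()`. Using `nunique()` keeps a ticker that appears on several dates from being counted more than once; if every row should count, `size()` could be used instead.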