如何解决情节:如何处理箱形图中类别之间的不均匀间隙? 完整代码:
我正在尝试使用较大数据集的子集生成箱形图。当我显示该图时,数据中存在奇怪的间隙。有没有一种方法可以将每个图的中心放在正确的标签上。另外,我可以删除图例中的多余标签吗?
fig = go.Figure()
melted_data = melted_data.sort_values(['model','alpha'])
for model,alpha in zip(combos['model'].to_list(),combos['alpha'].to_list()):
data = melted_data[(melted_data.model == model) & (melted_data.alpha == alpha)]
fig.add_trace(go.Box(
y= data['value'],x = data['model'],marker_color=colors[alpha],name = alpha,boxmean=True,))
fig.update_layout(
showlegend=True,boxmode='group',# group together boxes of the different traces for each value of x
boxgap = .1)
fig.show()
更新
以下是重现此问题的代码:
import numpy as np
import pandas as pd
import plotly.graph_objects as go
import plotly
colors = {'A':plotly.colors.qualitative.Plotly[0],'B':plotly.colors.qualitative.Plotly[1],'C':plotly.colors.qualitative.Plotly[2],'D':plotly.colors.qualitative.Plotly[3],'E':plotly.colors.qualitative.Plotly[4],}
models = ['modelA','modelA','modelB','modelC',]
samples = ['A','B','C','D','E','A','C']
score_cols = ['score_{}'.format(x) for x in range(10)]
scores = [(np.random.normal(mu,sd,10).tolist()) for mu,sd in zip((np.random.normal(.90,.06,10)),[.06]*10)]
data = dict(zip(score_cols,scores))
data['model'] = models
data['sample'] = samples
df = pd.DataFrame(data)
melted_data = pd.melt(df,id_vars =['model','sample'],value_vars=score_cols)
fig = go.Figure()
for model,sample in zip(models,samples):
data = melted_data[(melted_data['model'] == model) & (melted_data['sample'] == sample)]
fig.add_trace(go.Box(
y= data['value'],marker_color=colors[sample],name = sample,# group together boxes of the different traces for each value of x
boxgap = .1)
fig.show()
解决方法
我不太清楚您的go.Figure
为何会这样。但是,如果您将数据从宽到长整形并释放px.bar
,您将获得更短,更简洁的代码,并且可以说是更好的视觉效果。我们稍后可以讨论更多细节,但是您会在此情节之后找到完整的代码段:
完整代码:
import numpy as np
import pandas as pd
import plotly.graph_objects as go
import plotly
import plotly.express as px
colors = {'A':plotly.colors.qualitative.Plotly[0],'B':plotly.colors.qualitative.Plotly[1],'C':plotly.colors.qualitative.Plotly[2],'D':plotly.colors.qualitative.Plotly[3],'E':plotly.colors.qualitative.Plotly[4],}
models = ['modelA','modelA','modelB','modelC',]
samples = ['A','B','C','D','E','A','C']
score_cols = ['score_{}'.format(x) for x in range(10)]
scores = [(np.random.normal(mu,sd,10).tolist()) for mu,sd in zip((np.random.normal(.90,.06,10)),[.06]*10)]
data = dict(zip(score_cols,scores))
data['model'] = models
data['sample'] = samples
df = pd.DataFrame(data)
df_long = pd.wide_to_long(df,stubnames='score',i=['model','sample'],j='type',sep='_',suffix='\w+').reset_index()
df_long
fig = px.box(df_long,x='model',y="score",color ='sample')
fig.show()
版权声明:本文内容由互联网用户自发贡献,该文观点与技术仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 dio@foxmail.com 举报,一经查实,本站将立刻删除。