How to display 3 bar charts side by side in one figure using Altair


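I'm trying to replicate this chart, but I'm having trouble putting the 3 charts I created into a single figure. So far I've been able to create 3 separate bar charts with the right data and colors, but I haven't managed to combine them.

The chart I'm replicating:

[Image: starwars]

This is the code I use to create each chart. It is basically the same code repeated 3 times, once per chart; the charts are named "seen_movies_top", "seen_movies_middle" and "seen_movies_bottom". I feel like I'm overdoing it and that there is a simpler way to solve this, but I'm glad I could at least create each individual chart. Now I just need to get them onto the same figure.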

import pandas as pd
import altair as alt

# sw is the survey DataFrame, loaded elsewhere

# fix the labels a bit: create a mapping to the full names
episodes = ['EI', 'EII', 'EIII', 'EIV', 'EV', 'EVI']
names = {
    'EI': 'The Phantom Menace',
    'EII': 'Attack of the Clones',
    'EIII': 'Revenge of the Sith',
    'EIV': 'A New Hope',
    'EV': 'The Empire Strikes Back',
    'EVI': 'The Return of the Jedi',
}

# we'll use this order to sort, so names_l now holds our sort order
names_l = [names[ep] for ep in episodes]

print("sort order: ", names_l)

# only use those people who have seen at least one movie: toss the NAs
# and get the total count

# find people who have at least one of the columns (seen_*) not NaN
seen_at_least_one = sw.dropna(subset=['seen_' + ep for ep in episodes], how='all')
total = len(seen_at_least_one)

print("total who have seen at least one: ", total)

# keep only the people who have seen all six movies (no NaN in any seen_* column)
seen_every = seen_at_least_one.dropna(subset=['seen_' + ep for ep in episodes])
total_rank = len(seen_every)

# calculating the percents and generating a new data frame
percs_seen_top3 = []

# loop over each rank column and count the people who ranked the movie 1st or 2nd
# (ranks are stored as the strings '1'..'6'), then divide by the number of
# respondents who have seen all six movies

for rank_ep in ['rank_' + ep for ep in episodes]:
    counts = seen_every[rank_ep].value_counts()
    perc_seen_top3 = (counts['1'] + counts['2']) / total_rank
    percs_seen_top3.append(perc_seen_top3)

# creating tuples--pairing names with percents--using "zip" and then making a dataframe
tuples_top = list(zip([names[ep] for ep in episodes], percs_seen_top3))
seen_per_df_top = pd.DataFrame(tuples_top, columns=['Name', 'Percentage'])

bars_top = alt.Chart(seen_per_df_top).mark_bar(size=20).encode(
    # encode x as the percent, and hide the axis
    x=alt.X('Percentage', axis=None),
    # encode y using the name, use the movie name to label the axis
    y=alt.Y(
        'Name:N',
        axis=alt.Axis(tickCount=5, title=''),
        # we give the sorting order to avoid alphabetical order
        sort=names_l
    )
)

text_top = bars_top.mark_text(
    align='left', baseline='middle', dx=3  # nudges text to the right so it doesn't sit on top of the bar
).encode(
    # we'll use the percentage as the text
    text=alt.Text('Percentage:Q', format='.0%')
)

seen_movies_top = (text_top + bars_top).configure_mark(
    color='#008fd5'
).configure_view(
    # we don't want a stroke around the bars
    strokeWidth=0
).configure_scale(
    # add some padding
    bandPaddingInner=0.2
).properties(
    # set the dimensions of the visualization and add a title
    width=500,
    height=180,
    title={
        "text": ["How People Rate the 'Star Wars' Movies"],
        "subtitle": ["How often each film was rated in the top, middle and bottom third (by 471 respondents who have seen all six films)"]
    }
).configure_title(
    # customize title and sub-title
    fontSize=30, align='left', anchor='start', fontWeight='bold', subtitleFontWeight='lighter'
)

seen_movies_top

percs_seen_middle3 = []

for rank_ep in ['rank_' + ep for ep in episodes]:
    counts = seen_every[rank_ep].value_counts()
    perc_seen_middle3 = (counts['3'] + counts['4']) / total_rank
    percs_seen_middle3.append(perc_seen_middle3)

tuples_middle = list(zip([names[ep] for ep in episodes], percs_seen_middle3))
seen_per_df_middle = pd.DataFrame(tuples_middle, columns=['Name', 'Percentage'])

# ok, time to make the chart... let's make a bar chart (use mark_bar)
bars_middle = alt.Chart(seen_per_df_middle).mark_bar(size=20).encode(
    # encode x as the percent, and hide the axis
    x=alt.X('Percentage', axis=None),
    y=alt.Y(
        'Name:N',
        axis=alt.Axis(tickCount=5, title=''),
        # we give the sorting order to avoid alphabetical order
        sort=names_l
    )
)

# at this point we don't really have a great plot (it's missing the annotations, titles, etc.)
bars_middle

text_middle = bars_middle.mark_text(
    align='left', baseline='middle', dx=3
).encode(
    text=alt.Text('Percentage:Q', format='.0%')
)

seen_movies_middle = (text_middle + bars_middle).configure_mark(
    # we don't love the blue
    color='#69a14f'
).configure_view(
    # we don't want a stroke around the bars
    strokeWidth=0
).configure_scale(
    # add some padding
    bandPaddingInner=0.2
).properties(
    # set the dimensions of the visualization
    width=500, height=180
).configure_title(
    fontSize=30, align='left', anchor='start', fontWeight='bold', subtitleFontWeight='lighter'
)

seen_movies_middle

percs_seen_bottom3 = []

for rank_ep in ['rank_' + ep for ep in episodes]:
    counts = seen_every[rank_ep].value_counts()
    perc_seen_bottom3 = (counts['5'] + counts['6']) / total_rank
    percs_seen_bottom3.append(perc_seen_bottom3)

tuples_bottom = list(zip([names[ep] for ep in episodes], percs_seen_bottom3))
seen_per_df_bottom = pd.DataFrame(tuples_bottom, columns=['Name', 'Percentage'])

# ok, time to make the chart... let's make a bar chart (use mark_bar)
bars_bottom = alt.Chart(seen_per_df_bottom).mark_bar(size=20).encode(
    # encode x as the percent, and hide the axis
    x=alt.X('Percentage', axis=None),
    y=alt.Y(
        'Name:N',
        axis=alt.Axis(tickCount=5, title=''),
        # we give the sorting order to avoid alphabetical order
        sort=names_l
    )
)

# at this point we don't really have a great plot (it's missing the annotations, titles, etc.)
bars_bottom

text_bottom = bars_bottom.mark_text(
    align='left', baseline='middle', dx=3
).encode(
    text=alt.Text('Percentage:Q', format='.0%')
)

seen_movies_bottom = (text_bottom + bars_bottom).configure_mark(
    # we don't love the blue
    color='#fd3a4a'
).configure_view(
    # we don't want a stroke around the bars
    strokeWidth=0
).configure_scale(
    # add some padding
    bandPaddingInner=0.2
).properties(
    # set the dimensions of the visualization
    width=500, height=180
).configure_title(
    fontSize=30, align='left', anchor='start', fontWeight='bold', subtitleFontWeight='lighter'
)

seen_movies_bottom

[Images: the three resulting charts: top, middle, bottom]

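Solution

I don't normally use Altair, so I did a fair amount of research to put this together, and the code may be inconsistent in places. The one difference from your expected output is that the annotation text is not black; I was not able to fix that. Also, since the goal here is not to prepare the data, I created sample data for the chart and built the chart from that.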

import io

import altair as alt
import pandas as pd
from altair import datum

data = '''
episode name "Top third" "Middle third" "Bottom third"
1 "The Phantom Menace" 0.16 0.37 0.46
2 "Attack of the Clones" 0.14 0.29 0.57
3 "Revenge of the Sith" 0.13 0.40 0.47
4 "A New Hope" 0.50 0.31 0.19
5 "The Empire Strikes Back" 0.64 0.22 0.14
6 "Return of the Jedi" 0.43 0.41 0.17
'''

# whitespace-separated columns; quoted names keep their spaces
df = pd.read_csv(io.StringIO(data), sep=r'\s+')
# reshape from wide (one column per third) to long (one row per movie and third)
df = df.set_index(['episode', 'name']).stack().to_frame(name='percentage').reset_index()
df.columns = ['episode', 'name', 'rank', 'percentage']
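    episode                     name          rank  percentage
0         1       The Phantom Menace     Top third        0.16
1         1       The Phantom Menace  Middle third        0.37
2         1       The Phantom Menace  Bottom third        0.46
3         2     Attack of the Clones     Top third        0.14
4         2     Attack of the Clones  Middle third        0.29
5         2     Attack of the Clones  Bottom third        0.57
6         3      Revenge of the Sith     Top third        0.13
7         3      Revenge of the Sith  Middle third        0.40
8         3      Revenge of the Sith  Bottom third        0.47
9         4               A New Hope     Top third        0.50
10        4               A New Hope  Middle third        0.31
11        4               A New Hope  Bottom third        0.19
12        5  The Empire Strikes Back     Top third        0.64
13        5  The Empire Strikes Back  Middle third        0.22
14        5  The Empire Strikes Back  Bottom third        0.14
15        6       Return of the Jedi     Top third        0.43
16        6       Return of the Jedi  Middle third        0.41
17        6       Return of the Jedi  Bottom third        0.17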
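For readers starting from the survey data in the question rather than from hand-typed sample values, a long-form table of this shape can probably be computed in one pass instead of three loops. The following is a minimal sketch, not the asker's code: the `sample` frame is a made-up stand-in for `seen_every`, and it assumes the `rank_*` columns hold the strings '1' to '6' as in the question.

```python
import pandas as pd

# Hypothetical stand-in for the survey data: one row per respondent,
# one rank_* column per episode, ranks stored as strings '1'..'6'.
episodes = ['EI', 'EII', 'EIII', 'EIV', 'EV', 'EVI']
sample = pd.DataFrame(
    [
        ['4', '5', '6', '1', '2', '3'],
        ['1', '2', '3', '4', '5', '6'],
        ['6', '5', '4', '2', '1', '3'],
        ['5', '6', '4', '1', '2', '3'],
    ],
    columns=['rank_' + ep for ep in episodes],
)

# Map each rank to the third it falls in.
third_of_rank = {
    '1': 'Top third', '2': 'Top third',
    '3': 'Middle third', '4': 'Middle third',
    '5': 'Bottom third', '6': 'Bottom third',
}

# Wide -> long: one row per (respondent, episode) pair.
long_ranks = sample.melt(var_name='episode', value_name='rank_value')
long_ranks['episode'] = long_ranks['episode'].str.replace('rank_', '', regex=False)
long_ranks['rank'] = long_ranks['rank_value'].map(third_of_rank)

# Share of respondents per episode and third.
shares = (
    long_ranks.groupby(['episode', 'rank'])
    .size()
    .div(len(sample))
    .rename('percentage')
    .reset_index()
)
print(shares)
```

Combinations that never occur (an episode nobody placed in a given third) simply have no row in `shares`, so they would need to be filled in with 0 if every bar must be drawn.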
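domain = ['Top third', 'Middle third', 'Bottom third']
range_ = ['green', 'blue', 'red']
# keep the movies in episode order instead of alphabetical order
name_order = df['name'].unique().tolist()

bar1 = alt.Chart(df, title=domain[0]).mark_bar().encode(
    alt.X('percentage:Q', axis=None),
    alt.Y('name:O', sort=name_order, title=''),
    color=alt.Color('rank:N', legend=None, scale=alt.Scale(domain=domain, range=range_)),
).transform_filter(
    datum.rank == 'Top third'
).properties(
    width=50
)

text1 = bar1.mark_text(
    align='left', baseline='middle', dx=3
).encode(
    text=alt.Text('percentage:Q', format='.0%')
)

bar2 = alt.Chart(df, title=domain[1]).mark_bar().encode(
    alt.X('percentage:Q', axis=None),
    # hide the y axis here; the movie names are already shown on the first chart
    alt.Y('name:O', sort=name_order, axis=None),
    color=alt.Color('rank:N', legend=None, scale=alt.Scale(domain=domain, range=range_)),
).transform_filter(
    datum.rank == 'Middle third'
).properties(
    width=50
)

text2 = bar2.mark_text(
    align='left', baseline='middle', dx=3
).encode(
    text=alt.Text('percentage:Q', format='.0%')
)

bar3 = alt.Chart(df, title=domain[2]).mark_bar().encode(
    alt.X('percentage:Q', axis=None),
    alt.Y('name:O', sort=name_order, axis=None),
    color=alt.Color('rank:N', legend=None, scale=alt.Scale(domain=domain, range=range_)),
).transform_filter(
    datum.rank == 'Bottom third'
).properties(
    width=50
)

text3 = bar3.mark_text(
    align='left', baseline='middle', dx=3
).encode(
    text=alt.Text('percentage:Q', format='.0%')
)

# configure() goes first so that the configure_*() calls after it add to it
alt.hconcat(
    bar1 + text1, bar2 + text2, bar3 + text3,
    title=alt.TitleParams(
        text="How People Rate the 'Star Wars' Movies",
        subtitle=["How often each film was rated in the top, middle and bottom third ",
                  "(by 471 respondents who have seen all six films)"],
    )
).configure(
    background='#dcdcdc'
).configure_axis(
    grid=False
).configure_view(
    strokeWidth=0
)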
