将Pandas GroupBy对象转换为DataFrame

问题:

我从这样的输入数据开始

df1 = pandas.DataFrame( { 
    "Name" : ["Alice", "Bob", "Mallory", "Mallory", "Bob" , "Mallory"] , 
    "City" : ["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland"] } )

打印时显示为:

   City     Name
0   Seattle    Alice
1   Seattle      Bob
2  Portland  Mallory
3   Seattle  Mallory
4   Seattle      Bob
5  Portland  Mallory

分组简单:

g1 = df1.groupby( [ "Name", "City"] ).count()

并且打印产生GroupBy对象:

                  City  Name
Name    City
Alice   Seattle      1     1
Bob     Seattle      2     2
Mallory Portland     2     2
        Seattle      1     1

但是我最终想要的是包含GroupBy对象中所有行的另一个DataFrame对象。换句话说,我想得到以下结果:

                  City  Name
Name    City
Alice   Seattle      1     1
Bob     Seattle      2     2
Mallory Portland     2     2
Mallory Seattle      1     1

我不太明白如何在熊猫文档中完成这一点。欢迎任何提示。

回答:

 g1这里is一个DataFrame。它有一个分层索引,但是:

In [19]: type(g1)
Out[19]: pandas.core.frame.DataFrame

In [20]: g1.index
Out[20]: 
MultiIndex([('Alice', 'Seattle'), ('Bob', 'Seattle'), ('Mallory', 'Portland'),
       ('Mallory', 'Seattle')], dtype=object)

也许你想要这样的东西?

In [21]: g1.add_suffix('_Count').reset_index()
Out[21]: 
      Name      City  City_Count  Name_Count
0    Alice   Seattle           1           1
1      Bob   Seattle           2           2
2  Mallory  Portland           2           2
3  Mallory   Seattle           1           1

或者像:

In [36]: DataFrame({'count' : df1.groupby( [ "Name", "City"] ).size()}).reset_index()
Out[36]: 
      Name      City  count
0    Alice   Seattle      1
1      Bob   Seattle      2
2  Mallory  Portland      2
3  Mallory   Seattle      1

 
 
Code问答: http://codewenda.com/topics/python/
Stackoverflow: Converting a Pandas GroupBy object to DataFrame

*转载请注明本文链接以及stackoverflow的英文链接

发表评论

电子邮件地址不会被公开。 必填项已用*标注

+ 3 = 11