How to fix the error "string indices must be integers" when expanding contractions in text

When I run the code below on my DataFrame, it raises the error "string indices must be integers". I don't know how to fix this.

Here is the code I have tried so far

import re

# Dictionary of English contractions
contractions_dict = {"ain't": "are not","'s":" is","aren't": "are not","can't": "cannot","can't've": "cannot have","'cause": "because","Could've": "Could have","Couldn't": "Could not","Couldn't've": "Could not have","didn't": "did not","doesn't": "does not","don't": "do not","hadn't": "had not","hadn't've": "had not have","hasn't": "has not","haven't": "have not","he'd": "he would","he'd've": "he would have","he'll": "he will","he'll've": "he will have","how'd": "how did","how'd'y": "how do you","how'll": "how will","I'd": "I would","I'd've": "I would have","I'll": "I will","I'll've": "I will have","I'm": "I am","I've": "I have","isn't": "is not","it'd": "it would","it'd've": "it would have","it'll": "it will","it'll've": "it will have","let's": "let us","ma'am": "madam","mayn't": "may not","might've": "might have","mightn't": "might not","mightn't've": "might not have","must've": "must have","mustn't": "must not","mustn't've": "must not have","needn't": "need not","needn't've": "need not have","o'clock": "of the clock","oughtn't": "ought not","oughtn't've": "ought not have","shan't": "shall not","sha'n't": "shall not","shan't've": "shall not have","she'd": "she would","she'd've": "she would have","she'll": "she will","she'll've": "she will have","should've": "should have","shouldn't": "should not","shouldn't've": "should not have","so've": "so have","that'd": "that would","that'd've": "that would have","there'd": "there would","there'd've": "there would have","they'd": "they would"}

# Regular expression for finding contractions
contractions_re=re.compile('(%s)' % '|'.join(contractions_dict.keys()))

# Function for expanding contractions
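def expand_contractions(text,contractions_dict=contractions_dict):
  def replace(match):
    return contractions_dict[match.group(0)]
  # Substitute every matched contraction with its expansion
  return contractions_re.sub(replace, text)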

# Expanding contractions in the reviews
dataset['entitas bernama']=dataset['entitas bernama'].apply(lambda x:expand_contractions(x))

This is the error

dataset['entitas bernama']=dataset['entitas bernama'].apply(lambda x:expand_contractions(x))
error : string indices must be integers

Solution

This is how to replace values in a pandas Series

pandas.Series.replace(to_replace=contractions_dict,inplace=True,value=None,regex=True)

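From https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.replace.html

Dicts can be used to specify different replacement values for different existing values. For example, {'a': 'b', 'y': 'z'} replaces the value 'a' with 'b' and 'y' with 'z'. To use a dict in this way, the value parameter should be None.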
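The dict form described above can be tried on a tiny Series first. The following is a minimal sketch, not the asker's data: `small_dict` and the sample strings are made up for illustration. It also shows why `regex=True` matters here, since without it a dict key only matches a cell whose entire value equals that key.

```python
import pandas as pd

# Hypothetical two-entry mapping, standing in for the full contractions_dict
small_dict = {"I'll": "I will", "don't": "do not"}

s = pd.Series(["I'll be there", "don't worry", "nothing to expand"])

# With regex=True each key is treated as a regular-expression pattern and
# every match inside a string is replaced; value is left at its default
expanded = s.replace(to_replace=small_dict, regex=True)
print(expanded.tolist())

# Without regex=True a key must equal the whole cell value to be replaced,
# so these sentences are left unchanged
whole_only = s.replace(to_replace=small_dict)
print(whole_only.tolist())
```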

Example

import pandas as pd

contractions_dict = {...} # redacted

In []: twt = pd.read_csv('twitter4000.csv')
Out[]:
                                                      tweets    sentiment
       0    is bored and wants to watch a movie any sugge...    0
       1                back in miami. waiting to unboard ship  0
       2    @misskpey awwww dnt dis brng bak memoriessss,...   0
       3                    ughhh i am so tired blahhhhhhhhh    0
       4    @mandagoforth me bad! It's funny though. Zacha...   0
    ...     ...     ...
    3995                                    i just graduated    1
    3996            Templating works; it all has to be done     1
    3997                    mommy just brought me starbucks     1
    3998    @omarepps watching you on a House re-run...lov...   1
    3999    Thanks for trying to make me smile I'll make y...   1

    4000 rows × 2 columns

# Note: at a glance, of the last five rows shown above, only row 3999 contains a contraction ("I'll")

In []: # check which rows have contractions
       twt[twt.tweets.str.contains('|'.join(contractions_dict.keys()),regex=True)]
Out[]:
                                                       tweets   sentiment
       2    @misskpey awwww dnt dis brng bak memoriessss,...   0
       4    @mandagoforth me bad! It's funny though. Zacha...   0
       5    brr,i'm so cold. at the moment doing my assig...   0
       6    @kevinmarquis haha yep but i really need to sl...   0
       7    eating some ice-cream while I try to see @pete...   0
    ...     ...     ...
    3961                                gonna cousin's b.day.   1
    3968    @kat_n Got to agree it's a risk to put her thr...   1
    3983    About to watch the Lakers win game duece. I'm ...   1
    3986    @countroshculla yeah..needed to get up early.....   1
    3999    Thanks for trying to make me smile I'll make y...   1

    937 rows × 2 columns

In []: twt.tail(5).tweets.replace(to_replace=contractions_dict,regex=True)

Out[]:
    3995                                    i just graduated 
    3996            Templating works; it all has to be done  
    3997                     mommy just brought me starbucks 
    3998    @omarepps watching you on a House re-run...lov...
    3999    Thanks for trying to make me smile I will make...

    Name: tweets,dtype: object 
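If you would rather keep the regex-based function from the question than switch to Series.replace, the same idea can be sketched with the standard library alone. This is an illustration under assumptions: `sample_dict` is a made-up subset of the dictionary, and `pattern` and `mapping` are names I chose. Two details are worth noting: the keys are escaped with `re.escape`, and they are sorted longest-first so that "he'd've" is tried before its prefix "he'd".

```python
import re

# Hypothetical subset of the contractions dictionary, for illustration only
sample_dict = {"he'd": "he would", "he'd've": "he would have", "I'll": "I will"}

# Sort keys longest-first so "he'd've" is tried before its prefix "he'd",
# and escape each key so it is matched literally
pattern = re.compile(
    "|".join(re.escape(k) for k in sorted(sample_dict, key=len, reverse=True))
)

def expand_contractions(text, mapping=sample_dict):
    """Replace every contraction found in text with its expansion."""
    return pattern.sub(lambda m: mapping[m.group(0)], text)

print(expand_contractions("he'd've said I'll come"))
```

A function like this can be passed to `Series.apply`, but it expects every value to be a string; missing values in the column would need to be handled first.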

Pass inplace=True to Series.replace to avoid assigning the result back to the DataFrame, i.e. twt.tweets = twt.tweets.replace(...)
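For comparison, here is a minimal sketch of the assign-back form. The DataFrame and `small_dict` are hypothetical stand-ins for `twt` and `contractions_dict`. Assigning back is the more conservative choice, because whether an in-place call on a selected column reaches the parent DataFrame may depend on your pandas version and its copy-on-write settings; treat that as an assumption to check against the documentation for your version.

```python
import pandas as pd

# Hypothetical stand-ins for the twt DataFrame and contractions_dict
df = pd.DataFrame({"tweets": ["I'll make you smile", "i just graduated"]})
small_dict = {"I'll": "I will"}

# Assign the result back to the column instead of relying on inplace=True
df["tweets"] = df["tweets"].replace(to_replace=small_dict, regex=True)
print(df["tweets"].tolist())
```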
