pandas notes
Source: Internet  Editor: 程序博客网  Time: 2024/06/02 09:16
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
"""
@author: XiangguoSun
@contact: sunxiangguodut@qq.com
@file: learn_pandas.py
@time: 2017/3/8 8:18
@software: PyCharm
"""
import numpy as np
from pandas import Series, DataFrame
import pandas as pd

'''1. Basic data structures'''

'''1.1 Series: dict + array'''
obj_dic = {'a': 1, 'b': 2, 'c': 3}
objd = Series(obj_dic)  # a dict's keys become the index
obj = Series([4, 7, -5, 3], index=['a', 'b', 'c', 'd'])
print(obj.index, obj.values)
print(obj[['a', 'c']])
print('b' in obj)  # membership is tested against the index
obj_na = Series(obj, index=['a', 'b', 'c', 'd', 'add'])  # 'add' has no value, so it becomes NaN
print(obj_na)
print(obj_na.isnull())  # also pd.isnull(obj_na)
print(obj_na.notnull())  # also pd.notnull(obj_na)
print(obj_na.name)
print(obj_na.index.name)
obj_na.index = ['x', 'y', 'z', 'o', 'p']
obj_na.name = 'my_table'
obj_na.index.name = 'my_index'
print(obj_na)

'''1.2 DataFrame'''
data = {'state': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada'],
        'year': [2000, 2001, 2002, 2001, 2002],
        'pop': [1.5, 1.7, 3.6, 2.4, 2.9]}
df = DataFrame(data,
               columns=['year', 'state', 'pop', 'debt'],
               index=['one', 'two', 'three', 'four', 'five'])
print(df)  # 'debt' is not in data, so the whole column is NaN
print(df.loc['three'])  # select a row by label
df['five'] = np.arange(5)
print(df)

# Add and delete columns
df['new_column'] = df.state == 'Ohio'
print(df)
del df['new_column']
print(df.columns)

# Nested dicts: outer keys become columns, inner keys become the row index
pop = {'Nevada': {2001: 2.4, 2002: 2.9},
       'Ohio': {2000: 1.5, 2001: 1.7, 2002: 3.6}}
data = DataFrame(pop)
print(data)
print(data.T)
print(DataFrame(pop, index=[2001, 2002, 2003]))
data.index.name = 'sunxiangguo'
data.columns.name = 'state'
print(data)
print(data.values)  # returns a 2-D ndarray

# Index objects: rarely used directly, skipped here
'''Index objects are immutable'''

'''2. Basic functionality'''

# Reindexing
obj = Series([4.5, 7, -2, 4], index=['b', 'a', 'c', 'd'])
print(obj)
obj2 = obj.reindex(['a', 'b', 'c', 'd', 'e'])
print(obj2)
obj3 = obj.reindex(['a', 'b', 'c', 'd', 'e'], fill_value=0)
print(obj3)

# Filling missing values while reindexing
obj = Series(['blue', 'purple', 'yellow'], index=[0, 2, 4])
print(obj)
obj2 = obj.reindex(range(8), method='ffill')  # forward fill
print(obj2)
print(obj.reindex(range(7), method='pad'))  # same as ffill
print(obj.reindex(range(7), method='bfill'))  # backward fill
print(obj.reindex(range(7), method='backfill'))  # same as bfill
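The reindexing calls above differ only in how they fill labels that are missing from the original index. Below is a minimal sketch that puts the variants side by side, assuming Python 3 and a recent pandas release. The names `colors`, `plain`, `constant`, `forward`, and `backward`, and the fill string `'none'`, are illustrative and not part of the original notes.

```python
import pandas as pd

# Illustrative sample data with gaps at labels 1, 3 and 5.
colors = pd.Series(['blue', 'purple', 'yellow'], index=[0, 2, 4])

# Plain reindex: labels missing from the original index become NaN.
plain = colors.reindex(range(6))

# fill_value: every missing label receives the same constant.
constant = colors.reindex(range(6), fill_value='none')

# method='ffill': each missing label copies the nearest earlier value.
forward = colors.reindex(range(6), method='ffill')

# method='bfill': each missing label copies the nearest later value;
# label 5 has no later value, so it stays missing.
backward = colors.reindex(range(6), method='bfill')

print(pd.DataFrame({'plain': plain, 'constant': constant,
                    'ffill': forward, 'bfill': backward}))
```

Both `method='ffill'` and `method='bfill'` require the original index to be sorted (monotonic), which is why the sample index is `[0, 2, 4]`.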
# Drop entries from an axis
obj = Series(np.arange(5), index=['a', 'b', 'c', 'd', 'e'])
new_obj = obj.drop('c')  # drop returns a new object; obj is unchanged
print(obj)
print(new_obj)
print(obj.drop(['c', 'd']))
data = DataFrame(np.arange(16).reshape((4, 4)),
                 index=['Ohio', 'Colorado', 'Utah', 'New York'],
                 columns=['one', 'two', 'three', 'four'])
print(data)
print(data.drop(['Colorado', 'Ohio']))  # drop rows (axis=0 is the default)
print(data.drop('two', axis=1))  # drop a column
print(data.drop(['two', 'four'], axis=1))

# Indexing, selection, and filtering
data = Series(np.arange(4), index=['a', 'b', 'c', 'd'])
print(data)
print(data['b'])  # by label
print(data.iloc[1])  # by position
print(data.iloc[2:4])  # positional slice, end excluded
print(data[['b', 'a', 'd']])
print(data.iloc[[1, 3]])
print(data[data < 2])  # boolean filter
print(data['a':'c'])  # label slice, end included
data = DataFrame(np.arange(16).reshape((4, 4)),
                 index=['Ohio', 'Colorado', 'Utah', 'New York'],
                 columns=['one', 'two', 'three', 'four'])
print(data)
print(data['two'])  # select a column
print(data[['three', 'one']])  # select columns
print(data[:2])  # note: a slice or boolean array inside [] selects rows, not columns
print(data[data['three'] > 5])  # note: a slice or boolean array inside [] selects rows, not columns
print(data.iloc[:2, :2])  # select rows and columns at the same time
print(data.iloc[1:3])  # select a group of rows
print(data.xs('Ohio'))  # select a row by label
print(data.loc[:, ['two', 'three', 'four']])  # select several columns by label
'''data.icol(2) and data.irow(0) have been replaced by the two lines below'''
print(data.iloc[:, 2])
print(data.iloc[0])
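Label-based and position-based selection treat slice endpoints differently, which is easy to trip over. The sketch below reuses the 4x4 frame from the notes and assumes Python 3 with a recent pandas release. The names `by_label`, `by_position`, `filtered`, and `trimmed` are illustrative.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(16).reshape((4, 4)),
                    index=['Ohio', 'Colorado', 'Utah', 'New York'],
                    columns=['one', 'two', 'three', 'four'])

# Label-based slice with .loc: BOTH endpoints are included.
by_label = data.loc['Ohio':'Utah', 'one':'two']

# Position-based slice with .iloc: the stop position is excluded,
# as in ordinary Python slicing.
by_position = data.iloc[0:3, 0:2]

# Both expressions select the same 3x2 block.
print(by_label.equals(by_position))

# A boolean mask for rows combined with a list of column labels.
filtered = data.loc[data['three'] > 5, ['three', 'one']]
print(filtered)

# drop returns a new frame; the original keeps all four columns.
trimmed = data.drop('two', axis=1)
print(list(data.columns), list(trimmed.columns))
```

Writing `.loc` or `.iloc` explicitly makes it clear whether a number is meant as a label or as a position, which matters most when the index itself contains integers.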