?

Log in

Previous Entry | Next Entry

今天学习了在Python中读写Excel文件。发现还是蛮有意思的。

Excel的文件格式:
BIFF:http://sc.openoffice.org/excelfileformat.pdf
Microsoft Excel Releases
:
Excel version MS Windows Release year Apple Macintosh Release year
Excel 2.x Excel 2.0 1987 Excel 2.2 1989
Excel 3.0 Excel 3.0 1990 Excel 3.0 1990
Excel 4.0 Excel 4.0 1992 Excel 4.0 1992
Excel 5.0 Excel 5.0 1993 Excel 5.0 1993
Excel 7.0 Excel 95 1995 ─
Excel 8.0 Excel 97 1997 Excel 98 1998
Excel 9.0 Excel 2000 1999 Excel 2001 2000
Excel 10.0 Excel XP 2001 Excel v.X 2001
Excel 11.0 Excel 2003 2003 Excel 2004 2004

BIFF Versions for Worksheet/Workbook Documents
The following table shows which Excel version writes which file format for worksheet and workbook documents:
Excel version BIFF version Document type
Excel 2.x BIFF2 Worksheet
Excel 3.0 BIFF3 Worksheet
Excel 4.0 BIFF4 Worksheet
Excel 5.0 BIFF5 Workbook
Excel 7.0 BIFF5 Workbook
Excel 8.0 BIFF8 Workbook
Excel 9.0 BIFF8 Workbook
Excel 10.0 BIFF8 Workbook
Excel 11.0 BIFF8 Workbook
BIFF8 contains major changes towards older BIFF versions, for instance the handling of Unicode strings.


1)使用win32com, 这个对运行的系统有要求,它用ActiveX的方式要启动一个Excel的COM服务器,所以只能在Windows系统下使用,并且机器上也必须安装Excel的程序。“Python For Excel" 就是一个基于win32com上的Package。
2)pyExceleartor, 网络上的文章已经很多了,强项应该是在写Excel文件,看PyPI中Development Status是Alpha,所以这个今天没有试验。
3)openpyxl,A Python library to read/write Excel 2007 xlsx/xlsm files
4)xlrd,xlwt,xlutils, 今天学习的就是这个,现在总结一下。


xlrd和xlwt分别是exel read和excel write的意思,是同一个人所写,但是里面很多命名的方法是不一样的,而且不能直接互操作。比如里面的Style这个类,在xlrd中叫XF,在xlwt中叫XFStyle. 甚至连属性名也不一样。比如单元格的背景色,在elrd中叫pattern_colour_index, exwt中叫pattern_fore_colour。现在xlrd的文件已经比较全,xlwt的文档还没有完成,所以只能读那些example程序来学习。

xlutils是xlrd和xlwt的桥梁,里面的filter class可以实现从xlrt到xlwt的Object的转化。现在这个三个项目都hosting在http://www.simplistix.co.uk/

xlrd对BIFF<50有些限制,这个要参阅文档。

下面贴出今天的代码,供参考。

import xlrd
import xlwt

from xlrd import open_workbook,cellnameabs
from xlutils.copy import copy
from xlwt import Style


def copyXF(rdbook,rdxf):
    """
    clone a XFstyle from xlrd XF class,the code is copied from xlutils.copy module
    """
    
    wtxf = xlwt.Style.XFStyle()
    #
    # number format
    #
    wtxf.num_format_str = rdbook.format_map[rdxf.format_key].format_str
    #
    # font
    #
    wtf = wtxf.font
    rdf = rdbook.font_list[rdxf.font_index]
    wtf.height = rdf.height
    wtf.italic = rdf.italic
    wtf.struck_out = rdf.struck_out
    wtf.outline = rdf.outline
    wtf.shadow = rdf.outline
    wtf.colour_index = rdf.colour_index
    wtf.bold = rdf.bold #### This attribute is redundant, should be driven by weight
    wtf._weight = rdf.weight #### Why "private"?
    wtf.escapement = rdf.escapement
    wtf.underline = rdf.underline_type #### 
    # wtf.???? = rdf.underline #### redundant attribute, set on the fly when writing
    wtf.family = rdf.family
    wtf.charset = rdf.character_set
    wtf.name = rdf.name
    # 
    # protection
    #
    wtp = wtxf.protection
    rdp = rdxf.protection
    wtp.cell_locked = rdp.cell_locked
    wtp.formula_hidden = rdp.formula_hidden
    #
    # border(s) (rename ????)
    #
    wtb = wtxf.borders
    rdb = rdxf.border
    wtb.left   = rdb.left_line_style
    wtb.right  = rdb.right_line_style
    wtb.top    = rdb.top_line_style
    wtb.bottom = rdb.bottom_line_style 
    wtb.diag   = rdb.diag_line_style
    wtb.left_colour   = rdb.left_colour_index 
    wtb.right_colour  = rdb.right_colour_index 
    wtb.top_colour    = rdb.top_colour_index
    wtb.bottom_colour = rdb.bottom_colour_index 
    wtb.diag_colour   = rdb.diag_colour_index 
    wtb.need_diag1 = rdb.diag_down
    wtb.need_diag2 = rdb.diag_up
    #
    # background / pattern (rename???)
    #
    wtpat = wtxf.pattern
    rdbg = rdxf.background
    wtpat.pattern = rdbg.fill_pattern
    wtpat.pattern_fore_colour = rdbg.pattern_colour_index
    wtpat.pattern_back_colour = rdbg.background_colour_index
    #
    # alignment
    #
    wta = wtxf.alignment
    rda = rdxf.alignment
    wta.horz = rda.hor_align
    wta.vert = rda.vert_align
    wta.dire = rda.text_direction
    # wta.orie # orientation doesn't occur in BIFF8! Superceded by rotation ("rota").
    wta.rota = rda.rotation
    wta.wrap = rda.text_wrapped
    wta.shri = rda.shrink_to_fit
    wta.inde = rda.indent_level
    # wta.merg = ????
    #
    return wtxf

if __name__== "__main__" :
    
    #opne the excel file
    rb=open_workbook("test.xls",on_demand=True,formatting_info=True)
    
    for attr in ("biff_version","codepage","countries","encoding",
                 #"colour_map","font_list","format_list","format_map",
                 #"user_name",
                 "nsheets"):
        print "%s=%s" %(attr, rb.__getattribute__(attr))

    #show the loaded status for sheets
    for sheet_name in rb.sheet_names():
        print "%s loaded = %s" %(sheet_name, rb.sheet_loaded(sheet_name))
    
    #get the sheet1
    sheet=rb.sheet_by_index(0)
    
    print "%s has %d rows, %d cols" %(sheet.name, sheet.nrows, sheet.ncols)

    print "Shoe the file content"
    
    for rowx in range(0,sheet.nrows):
        for colx in range(0,sheet.ncols):
            
            #get the cell value
            cellvalue=sheet.cell_value(rowx,colx)
            #get the cell type
            celltype=sheet.cell_type(rowx,colx)

            #init the showable value
            showvalue=""
            
            if celltype == xlrd.XL_CELL_DATE:
                try:
                    showval = xlrd.xldate_as_tuple(cellvalue, rb.datemode)
                except xlrd.XLDateError:
                    e1, e2 = sys.exc_info()[:2]
                    showval = "%s:%s" % (e1.__name__, e2)
            elif celltype == xlrd.XL_CELL_ERROR:
                showval = xlrd.error_text_from_code.get(
                cellvalue, 'Unknown error code 0x%02x' % cellvalue)
            else:
                showval = cellvalue

            #get style

            xf=rb.xf_list[sheet.cell_xf_index(rowx,colx)]
            
            #print rb.colour_map[xf.background.background_colour_index]
            #display the cell forecolor
            color=rb.colour_map[xf.background.pattern_colour_index]

            #show the content
            print ("[%s]=%s,[color]=%s" % (cellnameabs(rowx,colx),showval,color))
            
    #show the loaded status for sheets after load sheet1
    for sheet_name in rb.sheet_names():
        print "%s loaded = %s" %(sheet_name, rb.sheet_loaded(sheet_name))
    
    #################
    #change the excel and write it out
    #################
 
    #get the xlwt object from rb object
    wb = copy(rb)
 
    #get the original excel cell style
    rbxf=rb.xf_list[sheet.cell_xf_index(0,0)]

    #copy the xlrd style to xlwt style
    wtrf=copyXF(rb,rbxf)

    #change an attribute of this style, the color index can refer to VBA document
    wtrf.pattern.pattern_fore_colour=15

    #set the new value and new style
    #wb.get_sheet(0).write(0,0,'changed!')
    wb.get_sheet(0).write(0,0,'changed!',wtrf)
    
    #output to file
    wb.save('test2.xls')

Tags:

Comments

( 2 comments — Leave a comment )
(Anonymous)
Feb. 12th, 2011 02:18 am (UTC)
不错的文章
正好需要。-limodou
daniel_zhy
Feb. 12th, 2011 03:06 am (UTC)
Re: 不错的文章
能得到老大的肯定,甚感荣幸。
( 2 comments — Leave a comment )

Latest Month

September 2011
S M T W T F S
    123
45678910
11121314151617
18192021222324
252627282930 

Page Summary

Powered by LiveJournal.com
Designed by Akiko Kurono