| 网站首页 | 业界新闻 | 小组 | 威客 | 人才 | 下载频道 | 博客 | 代码贴 | 在线编程 | 编程论坛
欢迎加入我们,一同切磋技术
用户名:   
 
密 码:  
共有 2240 人关注过本帖
标题:如何在查到重复信息时将新信息覆盖旧信息?
只看楼主 加入收藏
ictest
Rank: 3Rank: 3
等 级:论坛游侠
帖 子:333
专家分:114
注 册:2010-2-17
结帖率:70%
收藏
已结贴  问题点数:20 回复次数:10 
如何在查到重复信息时将新信息覆盖旧信息?
写了一段程序,用途是将数个数据文件按格式要求合并成一个文件,现在已经完成大半,有一个功能不会做,想请教一下怎么编写。
具体把程序说一下:
1、提取文件头并写入新文件(所有的数据文件格式一样,前十行为文件头)。                      已完成
2、将所有数据文件的数据段每行提取前7段数据(同时把第6段、第7段数据处理后写入第5段、第6段位置)合并到新文件。                已完成
3、第6段、第7段数据处理后后并起来就是坐标,写入list1。                已完成
4、每将一个坐标写入list1的同时进行比对是否重复,如果重复则将重复坐标写入list2.        已完成
5、如果有重复坐标,则将后面的含有重复坐标的一行(7段数据)覆盖先前的重复坐标的一行(7段数据),也就是说后面重复的覆盖前面被重复的。        不会写

不知我说清楚没有,如有不明白的,请问。

程序.rar (2.62 KB)


数据.rar (2.95 MB)
搜索更多相关主题的帖子: 重复 信息 数据 文件 坐标 
2018-02-01 13:34
ictest
Rank: 3Rank: 3
等 级:论坛游侠
帖 子:333
专家分:114
注 册:2010-2-17
收藏
得分:0 
不好意思啊,第一层的“程序”有些路径的小BUG,现修正了,重新上传,我的电脑运行1到4步共75秒左右,生成的新文件共约3万行不到。

程序2.rar (2.62 KB)
2018-02-01 13:43
ictest
Rank: 3Rank: 3
等 级:论坛游侠
帖 子:333
专家分:114
注 册:2010-2-17
收藏
得分:0 
各位版主和路过的高手,请问有哪位可以帮助帮助我么?谢谢各位了!
2018-02-02 07:58
wds1
Rank: 11Rank: 11Rank: 11Rank: 11
等 级:贵宾
威 望:49
帖 子:393
专家分:2025
注 册:2016-3-10
收藏
得分:0 
建议:
1、读取文件建议一次读取完毕,在处理。这样可以减少I/o时间
2、重复处理可以做一个判重的公共数组变量【例如数组名:count1,可以采用1维或者2维】以及一个判重函数,判重函数原理如下:
  2.1、第一次读取数据,写入公共数组count1(0)
  2.2、调用判重函数。
     利用for i 1= to  ubound(count1) 判断是否包含前面数据
     包含,返回包含位置
     不包含数组:数组用ReDim Preserve count1(ubound(count1)+1)保留前值方式数组+1,当前数据保存在数组最后,函数返回0
  2.3、根据函数返回值,具体处理。

  
2018-02-02 09:05
风吹过b
Rank: 20Rank: 20Rank: 20Rank: 20Rank: 20
等 级:贵宾
威 望:364
帖 子:4947
专家分:30084
注 册:2008-10-15
收藏
得分:0 
跑了一遍你的程序,
按你的程序流程来说,第 5 点无法完成。
原因是:
你是按顺序文件写入,写入后的数据就不能再返回去重写了。
-------------------
我想到的二种方法来达到目的。
一种是写临时文件的办法。二个临时文件
1、一种是写临时文件。
每一行写一个临时文件 ,文件名,就使用坐标来命名,有重复时,直接重写这个文件 。
不需要写入时判断文件是否存在。
最后合并时,使用 list1 里的坐标顺序,依次读取写入。
这种节省内存,但需要大量的IO,性能很低。
这个是代码量最少的写法。

2、一种是保存到内存
数据保存在一个数组里,数组动态增长。数组的下标顺序对应 list1 里的顺序,发现重复时,按 找到了的 Ret 序列重新保存这个元素。
最后 写入文件。
这种按数据量大小来决定占用内存大小,需要使用大量的内存。
这个是代码量偏多,的写法。



授人于鱼,不如授人于渔
早已停用QQ了
2018-02-02 09:31
风吹过b
Rank: 20Rank: 20Rank: 20Rank: 20Rank: 20
等 级:贵宾
威 望:364
帖 子:4947
专家分:30084
注 册:2008-10-15
收藏
得分:0 
发现 SendMessageFind 这里面 存在BUG。
代码无法写下去。先去解决吧!

第 10574 行的坐标是:107,160
然后有一个坐标是: 107,16

SendMessageFind 就找到这行了。



授人于鱼,不如授人于渔
早已停用QQ了
2018-02-02 10:17
wmf2014
Rank: 19Rank: 19Rank: 19Rank: 19Rank: 19Rank: 19
等 级:贵宾
威 望:216
帖 子:2039
专家分:11273
注 册:2014-12-6
收藏
得分:10 
用下述代码覆盖你原代码,应该达到你需求
程序代码:
Private Sub Command1_Click()
Dim S() As String
Dim q As String
Dim i As Integer
Dim z As Integer
Dim bb As String
Dim temp1 As String
Dim temp2 As String
Dim temp3 As String
Dim temp4 As String

Dim Ret As Long
Dim A As String


ttt = Timer

Open (Dir1.Path & "\" & File1.List(0)) For Input As #1
        Do While Not EOF(1)
        Line Input #1, temp1


If InStr(temp1, " ,Bin,Site,X,Y,") = 1 Then
S = Split(temp1, ",")
Text1.Text = S(0) + "," + S(1) + "," + S(2) + "," + S(3) + "," + S(4) + "," + S(5) + "," + S(6) + ","
End If

If InStr(temp1, " , , , , ,Open_Short*,Open_Short*") = 1 Then
S = Split(temp1, ",")
Text2.Text = S(0) + "," + S(1) + "," + S(2) + "," + S(3) + "," + S(4) + "," + S(5) + "," + S(6) + ","
End If

If InStr(temp1, " , , , , ,X_Axial        ,Y_Axial        ,") = 1 Then
S = Split(temp1, ",")
Text3.Text = S(0) + "," + S(1) + "," + S(2) + "," + S(3) + "," + S(4) + "," + S(5) + "," + S(6) + ","
End If

If InStr(temp1, "Unit, , , , ,") = 1 Then
S = Split(temp1, ",")
Text4.Text = S(0) + "," + S(1) + "," + S(2) + "," + S(3) + "," + S(4) + "," + S(5) + ","
End If

If InStr(temp1, "Hi_Lmt, , , , ,") = 1 Then
S = Split(temp1, ",")
Text5.Text = S(0) + "," + S(1) + "," + S(2) + "," + S(3) + "," + S(4) + "," + S(5) + "," + S(6) + ","
End If

If InStr(temp1, "Lo_Lmt, , , , ,") = 1 Then
S = Split(temp1, ",")
Text6.Text = S(0) + "," + S(1) + "," + S(2) + "," + S(3) + "," + S(4) + "," + S(5) + "," + S(6) + ","
End If

If InStr(temp1, "Force_val, , , , ,") = 1 Then
S = Split(temp1, ",")
Text7.Text = S(0) + "," + S(1) + "," + S(2) + "," + S(3) + "," + S(4) + "," + S(5) + "," + S(6) + ","
End If

If InStr(temp1, "Force_Rng, , , , ,") = 1 Then
S = Split(temp1, ",")
Text8.Text = S(0) + "," + S(1) + "," + S(2) + "," + S(3) + "," + S(4) + "," + S(5) + "," + S(6) + ","
End If

If InStr(temp1, "Meas_Rng, , , , , ") = 1 Then
S = Split(temp1, ",")
Text9.Text = S(0) + "," + S(1) + "," + S(2) + "," + S(3) + "," + S(4) + "," + S(5) + "," + S(6) + ","
Text10.Text = ""
Exit Do
End If

Loop
Close #1

If Dir(App.Path & "\合并导出", vbDirectory) = "" Then MkDir App.Path & "\合并导出" '如果不存在文件夹则创建之

Open (App.Path & "\合并导出\批数据.csv") For Append As #1

Print #1, Text1.Text
Print #1, Text2.Text
Print #1, Text3.Text
Print #1, Text4.Text
Print #1, Text5.Text
Print #1, Text6.Text
Print #1, Text7.Text
Print #1, Text8.Text
Print #1, Text9.Text
Print #1, Text10.Text
'------------------------------------------------------------------------------------------------------------------------
z = 0
For i = 0 To File1.ListCount - 1
  Open (Dir1.Path & "\" & File1.List(i)) For Input As #2
  Do While Not EOF(2)
    Line Input #2, temp2
    If IsNumeric(Left(temp2, 1)) = True Then
      S = Split(temp2, ",")
      temp3 = S(0) & "," & S(1) & "," & S(2) & "," & Val(S(5)) & "," & Val(S(6)) & "," & S(5) & "," & S(6) & ","
      temp4 = Val(S(5)) & "," & Val(S(6)) & ","   '这里必须补一个逗号或非数字字符,否则坐标(107,16)和坐标(107,160)会认为坐标相同
      Ret = SendMessageFind(List1.hwnd, LB_FINDSTRING, 0, (temp4))
      If Ret = LB_ERR Then
        List1.AddItem temp4
        List2.AddItem temp3
      Else
        List2.List(Ret) = temp3    '用当前重复坐标数据覆盖原数据
        z = z + 1
      End If
    End If
  Loop
  Close #2
Next i
For i = 0 To List2.ListCount - 1
  Print #1, List2.List(i)
Next
Close #1
ttt = Timer - ttt
MsgBox "共用时 " & ttt & "秒,共处理重复坐标" & z & "", 11, "提醒"
'------------------------------------------------------------------------------------------------------------------------
End Sub

能编个毛线衣吗?
2018-02-02 10:45
风吹过b
Rank: 20Rank: 20Rank: 20Rank: 20Rank: 20
等 级:贵宾
威 望:364
帖 子:4947
专家分:30084
注 册:2008-10-15
收藏
得分:10 
我是用数组保存,没使用 list2 保存。

Private Sub Command1_Click()
'合并
Dim S() As String
Dim q As String
Dim i As Long               '在32位系统中,使用 long 运算速度最快
Dim z As Long               'integer 是16位的,只在 16位系统
Dim bb As String
Dim temp1 As String
Dim temp2 As String
Dim temp3 As String
Dim temp4 As String

Dim Ret As Long
Dim A As String

'新增变量定义,没去检查能否使用前面已弃用的变量
Dim s2 As String, D() As String         's2,临时变量,D() 保存数据的内存
Dim DCount As Long, Dmax As Long        'DCount 已使用数据量, DMAX 内存数组当前大小
Dim s3 As String, s4 As String


ttt = Timer

Open (Dir1.Path & "\" & File1.List(0)) For Input As #1
        Do While Not EOF(1)
        Line Input #1, temp1

'字符串连接里的+号建议改写成 &

If InStr(temp1, " ,Bin,Site,X,Y,") = 1 Then
S = Split(temp1, ",")
Text1.Text = S(0) + "," + S(1) + "," + S(2) + "," + S(3) + "," + S(4) + "," + S(5) + "," + S(6) + ","
End If

If InStr(temp1, " , , , , ,Open_Short*,Open_Short*") = 1 Then
S = Split(temp1, ",")
Text2.Text = S(0) + "," + S(1) + "," + S(2) + "," + S(3) + "," + S(4) + "," + S(5) + "," + S(6) + ","
End If

If InStr(temp1, " , , , , ,X_Axial        ,Y_Axial        ,") = 1 Then
S = Split(temp1, ",")
Text3.Text = S(0) + "," + S(1) + "," + S(2) + "," + S(3) + "," + S(4) + "," + S(5) + "," + S(6) + ","
End If

If InStr(temp1, "Unit, , , , ,") = 1 Then
S = Split(temp1, ",")
Text4.Text = S(0) + "," + S(1) + "," + S(2) + "," + S(3) + "," + S(4) + "," + S(5) + ","
End If

If InStr(temp1, "Hi_Lmt, , , , ,") = 1 Then
S = Split(temp1, ",")
Text5.Text = S(0) + "," + S(1) + "," + S(2) + "," + S(3) + "," + S(4) + "," + S(5) + "," + S(6) + ","
End If

If InStr(temp1, "Lo_Lmt, , , , ,") = 1 Then
S = Split(temp1, ",")
Text6.Text = S(0) + "," + S(1) + "," + S(2) + "," + S(3) + "," + S(4) + "," + S(5) + "," + S(6) + ","
End If

If InStr(temp1, "Force_val, , , , ,") = 1 Then
S = Split(temp1, ",")
Text7.Text = S(0) + "," + S(1) + "," + S(2) + "," + S(3) + "," + S(4) + "," + S(5) + "," + S(6) + ","
End If

If InStr(temp1, "Force_Rng, , , , ,") = 1 Then
S = Split(temp1, ",")
Text8.Text = S(0) + "," + S(1) + "," + S(2) + "," + S(3) + "," + S(4) + "," + S(5) + "," + S(6) + ","
End If

If InStr(temp1, "Meas_Rng, , , , , ") = 1 Then
S = Split(temp1, ",")
Text9.Text = S(0) + "," + S(1) + "," + S(2) + "," + S(3) + "," + S(4) + "," + S(5) + "," + S(6) + ","
Text10.Text = ""
Exit Do
End If

Loop
Close #1


If Dir(App.Path & "\合并导出", vbDirectory) = "" Then MkDir App.Path & "\合并导出" '如果不存在文件夹则创建之

Open (App.Path & "\合并导出\批数据.csv") For Output As #1           'append 会保留前面的数据,应该使用output直接新建文件

Print #1, Text1.Text
Print #1, Text2.Text
Print #1, Text3.Text
Print #1, Text4.Text
Print #1, Text5.Text
Print #1, Text6.Text
Print #1, Text7.Text
Print #1, Text8.Text
Print #1, Text9.Text
Print #1, Text10.Text

'------------修改了以下代码,按方案二---------------
DCount = -1         '使用之前会自加1,所以这里前一个已使用的元素下标就是-1
Dmax = 1000         '初始给 1000个元素,超过后,再增加
ReDim D(Dmax)
List1.Clear         '清除列表,防止二次执行时,导致查询到上次的坐标上
List2.Clear

For i = 0 To File1.ListCount - 1

Open (Dir1.Path & "\" & File1.List(i)) For Input As #2

        Do While Not EOF(2)
        Line Input #2, temp2
If IsNumeric(Left(temp2, 1)) = True Then
S = Split(temp2, ",")

'写入内容,去掉,生成最终数据后再一并写入
'Print #1, S(0) + "," + S(1) + "," + S(2) + "," + Mid(S(5), 1, (Len(S(5)) - 2)) + "," + Mid(S(6), 1, (Len(S(6)) - 2)) + "," + S(5) + "," + S(6) + ","

'组合成数据
s2 = S(0) + "," + S(1) + "," + S(2) + "," + Mid(S(5), 1, (Len(S(5)) - 2)) + "," + Mid(S(6), 1, (Len(S(6)) - 2)) + "," + S(5) + "," + S(6) + ","

'temp4 = Mid(S(5), 1, (Len(S(5)) - 2)) + "," + Mid(S(6), 1, (Len(S(6)) - 2))      '使用这种方法生成的坐标,有BUG,如 173.16 会查找到 173.160

'改用按固定长度生成坐标。如果坐标有4位的,需要修改成 "0000"
s3 = Format(Mid(S(5), 1, (Len(S(5)) - 2)), "000")
s4 = Format(Mid(S(6), 1, (Len(S(6)) - 2)), "000")
temp4 = s3 & "," & s4

'按数据原本长度进行查找时,会导致BUG
Ret = SendMessageFind(List1.hwnd, LB_FINDSTRING, 0, (temp4))
   
    If Ret = LB_ERR Then
        List1.AddItem temp4
        
        DCount = DCount + 1     '下一个未使用的元素下标
        If DCount > Dmax Then   '如果超过数据最大下标
            Dmax = Dmax + 50    '每次元素按50个申请,减少每次内存复制时间,每个元素空内容时,仅占12字节
            ReDim Preserve D(Dmax)  '保留数据的情况下,重定义数组大小
        End If
        D(DCount) = s2          '保存数据,数组里的顺序是按读到的顺序来排列的
   
    Else                        '找到了重复项
    List1.ListIndex = Ret
    List2.AddItem temp4   
    D(Ret) = s2                  '把新数据保存到旧数据元素里   
    End If
End If
Loop
Close #2
Next i

'写入
For i = 0 To DCount             '把所有已保存在数组里的数据全部写入
    Print #1, D(i)
Next i

Close #1
'---------修改的代码结束-------------
ttt = Timer - ttt
MsgBox "共写入 " & DCount & " 行数据,共用时 " & ttt & " 秒", 11, "提醒"
End Sub

[此贴子已经被作者于2018-2-2 16:41编辑过]


授人于鱼,不如授人于渔
早已停用QQ了
2018-02-02 16:36
ictest
Rank: 3Rank: 3
等 级:论坛游侠
帖 子:333
专家分:114
注 册:2010-2-17
收藏
得分:0 
非常感谢两位大神的回复!两位大神的程序异曲同工的达到了我的需求,这说明vb语言博大精深,往往“一语惊醒梦中人”,自己往往会迷失在自己的思路中,高人的一句指点,就会直击要害,令人醍醐灌顶,恍然大悟,谢谢两位的指导!我还有很长的路要走。
2018-02-06 13:51
ictest
Rank: 3Rank: 3
等 级:论坛游侠
帖 子:333
专家分:114
注 册:2010-2-17
收藏
得分:0 
对了,想再麻烦麻烦两位大神,两位的程序中,能否增加一个类似于进度条的东东来显示处理进度?毕竟一分多钟傻傻的盯着界面没什么变化,然后突然跳出个结束对话框,这样的界面不太人性化。谢谢两位大神了!
2018-02-06 13:57
快速回复:如何在查到重复信息时将新信息覆盖旧信息?
数据加载中...
 
   



关于我们 | 广告合作 | 编程中国 | 清除Cookies | TOP | 手机版

编程中国 版权所有,并保留所有权利。
Powered by Discuz, Processed in 0.045031 second(s), 9 queries.
Copyright©2004-2024, BCCN.NET, All Rights Reserved