网络书屋(Web Reading Room)


Batch-renaming PDF files and creating wiki links

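Tools such as bash, awk, and sed each do one job well, but each also has its shortcomings. This post works through two tasks, batch renaming files and batch exporting links into a vimwiki page in the form [[local:file path]], as a way to consolidate Linux command knowledge. Treat it as homework (a learning process).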

Extracting paths to use as quick links in vimwiki

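  • If IFS contains the space character (as it does by default), a long file name with spaces is split into several words and printed across many lines, so IFS is set to IFS=$'\n' here.

  • The pattern [a-z]* replaces the original *, so that the entry for the current directory (.) is no longer matched.

  • Use echo and a pipe to pass text to sed or awk.
  • In sed, the position anchors ^ and $ can be used for substitution. If an entry is a directory, its name is rewritten and the directory is then traversed.
  • A nested for loop controls the traversal.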
IFS=$'\n'   # split find output on newlines only, so names with spaces stay whole
count=1
countDir=1
specialCharacter='pages'

# @Description : print a vimwiki heading per directory and a numbered link per PDF
# @usage       : generateChapter
generateChapter()
{
    # find already returns paths relative to the current directory,
    # so there is no need to cd into each directory
    for var2 in $(find . -name "[a-z]*")
    do
        if [[ -d $var2 ]]
        then
            # Replace the leading "." with the Windows base path, then wrap in [[local:...]]
            var=$(echo "$var2" | sed 's/^\./F:\/ScienceBase.Attachments\/WindEnergy/' | sed 's/^/[[local:/' | sed 's/$/]]/')
            printf '= %d. [ ] %s =\n' "$countDir" "$var"
            countDir=$((countDir + 1))

            for tempVar in $(find "$var2" -name "*.pdf")
            do
                temp1=$(echo "$tempVar" | sed 's/^\./F:\/ScienceBase.Attachments\/WindEnergy/')
                varr=$(echo "$temp1" | sed 's/^/[[local:/' | sed 's/$/]]/')
                printf '\t%d. [ ] %s\n' "$count" "$varr"
                count=$((count + 1))
            done
            count=1   # restart PDF numbering for the next directory
        fi
    done
}

generateChapter
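
The same idea can be sketched without changing IFS globally, by reading the find output line by line and using parameter expansion instead of sed. This is only a variant under stated assumptions: the names to_wiki_link and list_chapters and the base-path argument are made up for illustration, every directory (not only those matching [a-z]*) becomes a heading, and file names containing newlines are not handled.

```bash
# Sketch only: function names and the base-path argument are illustrative.

# Turn ./dir/ch1.pdf into [[local:<base>/dir/ch1.pdf]].
to_wiki_link() {
    printf '[[local:%s%s]]' "$1" "${2#.}"   # ${2#.} drops the leading dot
}

# Print one heading per directory and one numbered entry per PDF below it.
list_chapters() {
    local base=$1
    local countDir=1
    local count
    find . -mindepth 1 -type d | sort | while IFS= read -r dir; do
        printf '= %d. [ ] %s =\n' "$countDir" "$(to_wiki_link "$base" "$dir")"
        countDir=$((countDir + 1))
        count=1
        find "$dir" -name '*.pdf' | sort | while IFS= read -r pdf; do
            printf '\t%d. [ ] %s\n' "$count" "$(to_wiki_link "$base" "$pdf")"
            count=$((count + 1))
        done
    done
}
```

Run from the top of the document tree, for example: list_chapters 'F:/ScienceBase.Attachments/WindEnergy'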

Removing unneeded strings from file names and renaming

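  • Remove unneeded information such as "(pages 542–555)" from the PDF file names.
  • Use awk printf to produce a comma-separated string, then xargs -d, mv to split it on the comma and rename the file (of the several methods I tried, this was the only one that worked).
  • xargs -n 2 splits its input on whitespace and passes two arguments at a time to the command, one pair per invocation.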
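
The two xargs modes mentioned above can be tried safely before touching real files. In this small sketch printf stands in for mv, so nothing is renamed; the sample names and the function names pairs and split_on_comma are made up, and -d is a GNU xargs option.

```bash
# -n 2: split the input on whitespace and pass two arguments per invocation.
pairs() {
    printf 'a b c d' | xargs -n 2 echo
}

# -d, : split on commas only, so spaces inside each name survive (GNU xargs).
# printf stands in for mv here, so nothing is actually renamed.
split_on_comma() {
    printf '%s,%s' 'old name (pages 1-2).pdf' 'old name.pdf' |
        xargs -d, printf '[%s] -> [%s]\n'
}

pairs
split_on_comma
```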
#!/bin/bash -
#===============================================================================
#
#          FILE: b.sh
#
#         USAGE: ./b.sh
#
#   DESCRIPTION: 
#
#       OPTIONS: ---
#  REQUIREMENTS: ---
#          BUGS: ---
#         NOTES: ---
#        AUTHOR: Ye Zhao Liang (Vimer), zhaoturkkey@163.com
#  ORGANIZATION: BrokenSun
#       CREATED: 2017/7/4 23:01:31
#      REVISION:  ---
#===============================================================================

IFS=$'\n'   # split find output on newlines only
count=1
countDir=1
specialCharacter='pages'

# @Description : strip the "(pages ...)" suffix from PDF names in each issue directory
# @usage       : generateChapter
generateChapter()
{
    # find already returns paths relative to the current directory
    #for var2 in $(find . -name "*")
    for var2 in $(find . -name "windEnergy201*")
    do
        if [[ -d $var2 ]]
        then
            cd "$var2" || continue
            # awk prints "old,new"; xargs -d, splits on the comma and hands both names to mv
            for var in $(find . -name "*"); do echo "$var" | awk '/pages/{printf("%s,%s",$0,substr($0,0,length($0)-22)".pdf")|"xargs -d, mv ";}'; done
            cd ..
        fi
    done
}

generateChapter

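Note: the check command from tip 1 under "Tips learned" can be used to verify that every rename succeeded. When a file name itself contains a comma, the pages suffix is usually not removed, because xargs splits on that comma as well. The improvement is to use a semicolon as the separator instead.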

Improved code

for var in $(find . -name "*"); do echo "$var" | awk '/pages/{printf("%s;%s",$0,substr($0,0,length($0)-22)".pdf")|"xargs -d \";\" mv ";}'; done
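
Both one-liners above cut a fixed 22 characters and depend on a single-character delimiter that may still occur in a title. A possible alternative, sketched here with the hypothetical helper names strip_pages and rename_pages, matches the " (pages ...).pdf" suffix by pattern and calls mv directly, so no delimiter is needed. It assumes the suffix always has that shape and that names contain no newlines.

```bash
# Drop a trailing " (pages ...)" group from a PDF name; other names pass through.
strip_pages() {
    local suffix=' (pages *).pdf'   # glob pattern, matched against the end of the name
    case $1 in
        *' (pages '*').pdf') printf '%s.pdf\n' "${1%$suffix}" ;;
        *)                   printf '%s\n' "$1" ;;
    esac
}

# Rename every matching PDF below the current directory.
rename_pages() {
    find . -type f -name '*(pages *).pdf' | while IFS= read -r old; do
        new=$(strip_pages "$old")
        [ "$old" = "$new" ] || mv -- "$old" "$new"
    done
}
```

Calling rename_pages inside the WindEnergy directory would then cover all issue folders in one pass.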

Final result

= 1. [ ] [[local:F:/ScienceBase.Attachments/WindEnergy/harmonic power system]] =
  1. [ ] [[local:F:/ScienceBase.Attachments/WindEnergy/harmonic power system/ch1.pdf]]
  2. [ ] [[local:F:/ScienceBase.Attachments/WindEnergy/harmonic power system/ch10.pdf]]
  3. [ ] [[local:F:/ScienceBase.Attachments/WindEnergy/harmonic power system/ch11.pdf]]
  4. [ ] [[local:F:/ScienceBase.Attachments/WindEnergy/harmonic power system/ch12.pdf]]
  5. [ ] [[local:F:/ScienceBase.Attachments/WindEnergy/harmonic power system/ch13.pdf]]
  6. [ ] [[local:F:/ScienceBase.Attachments/WindEnergy/harmonic power system/ch14.pdf]]
  7. [ ] [[local:F:/ScienceBase.Attachments/WindEnergy/harmonic power system/ch15.pdf]]
  8. [ ] [[local:F:/ScienceBase.Attachments/WindEnergy/harmonic power system/ch16.pdf]]
  9. [ ] [[local:F:/ScienceBase.Attachments/WindEnergy/harmonic power system/ch2.pdf]]
  10. [ ] [[local:F:/ScienceBase.Attachments/WindEnergy/harmonic power system/ch3.pdf]]
  11. [ ] [[local:F:/ScienceBase.Attachments/WindEnergy/harmonic power system/ch4.pdf]]
  12. [ ] [[local:F:/ScienceBase.Attachments/WindEnergy/harmonic power system/ch5.pdf]]
  13. [ ] [[local:F:/ScienceBase.Attachments/WindEnergy/harmonic power system/ch6.pdf]]
  14. [ ] [[local:F:/ScienceBase.Attachments/WindEnergy/harmonic power system/ch7.pdf]]
  15. [ ] [[local:F:/ScienceBase.Attachments/WindEnergy/harmonic power system/ch8.pdf]]
  16. [ ] [[local:F:/ScienceBase.Attachments/WindEnergy/harmonic power system/ch9.pdf]]
  17. [ ] [[local:F:/ScienceBase.Attachments/WindEnergy/harmonic power system/fmatter.pdf]]
  18. [ ] [[local:F:/ScienceBase.Attachments/WindEnergy/harmonic power system/index.pdf]]
  19. [ ] [[local:F:/ScienceBase.Attachments/WindEnergy/harmonic power system/scard.pdf]]
= 2. [ ] [[local:F:/ScienceBase.Attachments/WindEnergy/Offshore Wind Energy Generation Control, Protection, and Integration to Electrical Systems/offshoreWindEnergy]] =
  1. [ ] [[local:F:/ScienceBase.Attachments/WindEnergy/Offshore Wind Energy Generation Control, Protection, and Integration to Electrical Systems/offshoreWindEnergy/app1.pdf]]
  2. [ ] [[local:F:/ScienceBase.Attachments/WindEnergy/Offshore Wind Energy Generation Control, Protection, and Integration to Electrical Systems/offshoreWindEnergy/app2.pdf]]

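Tips learned

  1. awk has two ways to write a condition. In the if form, the statements must be separated by semicolons; leaving them out causes a syntax error.

Mind the semicolons!
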
awk '{if ($1==1) print "A"; else if ($1==2) print "B"; else print "C"}'

The bash counterpart uses the if, then, else, fi form; when each keyword sits on its own line, no semicolons are needed between statements.

for var in $(find . -name "*")
do
    if [[ -d $var ]]
    then
        printf '%s\n' "$var"      # directories are printed flush left
    else
        printf '\t%s\n' "$var"    # everything else is indented by one tab
    fi
done

In awk, the pattern form '/pages/{...}' is equivalent to '{if($0~/pages/){...}}'.
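
A quick way to convince yourself of this equivalence is to feed both forms the same input. The sample lines and the function names with_pattern and with_if are made up for this sketch.

```bash
# Pattern form: the action runs only for lines matching /pages/.
with_pattern() { awk '/pages/ { print $0 }'; }

# Explicit form: the same test written as an if inside an unconditional action.
with_if() { awk '{ if ($0 ~ /pages/) print $0 }'; }

sample='a (pages 1-2).pdf
b.pdf'
printf '%s\n' "$sample" | with_pattern
printf '%s\n' "$sample" | with_if
```

Both calls print only the first sample line.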

The command below also serves as a check on the rename script above: it lists the PDF files whose names still contain the pages field.

YeZhao@DESKTOP-YeZhao /cygdrive/f/ScienceBase.Attachments/WindEnergy
$ find . -name "*"|awk '{if($0~/pages/){print $0}}'
./windEnergy2009-i6/Characterizing future large, rapid changes in aggregated wind power using Numerical Weather Prediction spatial fields (pages 542–555).pdf
./windEnergy2012-i1/Modeling wake effects in large wind farms in complex terrain the problem, the methods and the issues (pages 161–182).pdf
./windEnergy2012-i2/The Betz–Joukowsky limit on the contribution to rotor aerodynamics by the British, German and Russian scientific schools (pages 335–344).pdf
./windEnergy2012-i3/Computational fluid dynamics simulation of the aerodynamics of a high solidity, small-scale vertical axis wind turbine (pages 349–361).pdf
./windEnergy2012-i3/Correction factors for NRG #40 anemometers potentially affected by dry friction whip characterization, analysis, and validation (pages 489–502).pdf
./windEnergy2012-i4/Analysis of wake measurements from the ECN Wind Turbine Test Site Wieringermeer, EWTW (pages 575–591).pdf
./windEnergy2012-i5/Atmospheric stability and turbulence fluxes at Horns Rev—an intercomparison of sonic, bulk and WRF model data (pages 717–731).pdf
./windEnergy2013-11/Modeling, simulation and control of a wind turbine with a hydraulic transmission system (pages 1259–1276).pdf
./windEnergy2013-8/Indicial lift response function an empirical relation for finite-thickness airfoils, and effects on aeroelastic simulations (pages 681–693).pdf
./windEnergy2013-8/Simulating the dynamics of wind turbine blades part I, model development and verification (pages 694–710).pdf
./windEnergy2013-8/Simulating the dynamics of wind turbine blades part II, model validation and uncertainty quantification (pages 741–758).pdf
./windEnergy2014-2/An assessment of the impact of reduced averaging time on small wind turbine power curves, energy capture predictions and turbulence intensity measurements (pages 337–342).pdf
./windEnergy2014-9/Dynamic response analysis of wind turbines under blade pitch system fault, grid loss, and shutdown events (pages 1385–1409).pdf
./windEnergy2015-10/Rapid optimization of stall-regulated wind turbine blades using a frequency-domain method Part 1, loads analysis (pages 1703–1723).pdf
./windEnergy2015-11/Application and validation of incrementally complex models for wind turbine aerodynamics, isolated wind turbine in uniform inflow conditions (pages 1893–1916).pdf
./windEnergy2015-2/Wind turbine boundary layer arrays for Cartesian and staggered configurations Part II, low-dimensional representations via the proper orthogonal decomposition (pages 297–315).pdf
./windEnergy2015-2/Wind turbine boundary layer arrays for Cartesian and staggered configurations-Part I, flow field and power measurements (pages 277–295).pdf
./windEnergy2015-4/Utilization of machine-learning algorithms for wind turbine site suitability modeling in Iowa, USA (pages 713–727).pdf
./windEnergy2015-6/Rapid optimization of stall-regulated wind turbine blades using a frequency-domain method Part 2, cost function selection and results (pages 955–977).pdf
./windEnergy2015-7/Variable geometry wind turbine for performance enhancement, improved survivability and reduced cost of energy (pages 1303–1311).pdf
./windEnergy2016-11/Reliability of wind turbines modeled by a Poisson process with covariates, unobserved heterogeneity and seasonality (pages 1991–2002).pdf
./windEnergy2016-2/Cylindrical vortex wake model skewed cylinder, application to yawed or tilted rotors (pages 345–358).pdf
./windEnergy2016-6/Effects of low temperature on the mechanical properties of glass fibre–epoxy composites static tension, compression, R = 0.1 and R =− 1 fatigue of ±45laminates (pages 1023–1041).pdf
./windEnergy2016-6/Failure rate, repair time and unscheduled O&M cost analysis of offshore wind turbines (pages 1107–1119).pdf
./windEnergy2017-02/Verifying the Blade Element Momentum Method in unsteady, radially varied, axisymmetric loading using a vortex ring model (pages 269–288).pdf

2. awk BEGIN

function name()
{}

BEGIN{
}
{
    
}
END{

}
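
To make the skeleton above concrete, here is a small filled-in sketch. The function twice, the wrapper sum_lengths, and the sample input are invented for illustration; the point is only when each block runs.

```bash
sum_lengths() {
    awk '
    function twice(x) { return 2 * x }      # functions live at the top level
    BEGIN { total = 0 }                     # runs once, before any input
    { total += twice(length($0)) }          # runs for every input line
    END { print NR, total }                 # runs once, after the last line
    '
}

printf 'ab\ncde\n' | sum_lengths   # prints: 2 10
```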

3. awk gsub

echo "a b c 2011-11-22 a:d" | awk 'gsub(/-/,"",$4)'
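
In the one-liner above, gsub(...) sits in the pattern position: it returns the number of substitutions, so the line is printed only when at least one hyphen was replaced. A sketch of the explicit form, which prints every line together with that count (the wrapper name strip_hyphens_in_field4 is made up):

```bash
strip_hyphens_in_field4() {
    # gsub edits $4 in place and returns how many replacements it made
    awk '{ n = gsub(/-/, "", $4); print n, $0 }'
}

echo "a b c 2011-11-22 a:d" | strip_hyphens_in_field4   # prints: 2 a b c 20111122 a:d
```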

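4. Defining awk variables

Variables can be defined inside the BEGIN block, or on the command line with awk -v (one variable per -v).

awk's built-in variables include FS, OFS, NR, FNR, NF, ARGC, and ARGV; $0, $1, $2, and so on refer to the current record and its fields.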
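
A short sketch that combines both ways of defining variables with a few built-ins. The variable names label and count, the wrapper show_vars, and the sample input are assumptions for this example only.

```bash
show_vars() {
    awk -F, -v OFS=' | ' -v label="row" '
    BEGIN { count = 0 }                     # user variable defined in BEGIN
    { count++; print label, NR, NF, $1 }    # NR: record number, NF: field count
    END { print "total", count }
    '
}

printf 'x,y,z\np,q\n' | show_vars
```

Here -F, sets FS, and the commas in print are replaced by OFS on output.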

5. Defining awk functions

awk function definitions sit outside BEGIN{}, {}, and END{}, at the same level as those blocks.

#!/usr/bin/awk -f
#===============================================================================
#
#          File:  func.awk
# 
#   Description:  awk -f func.awk file
#           the input file contains the single line 400
# 
#   VIM Version:  7.0+
#        Author:  Ye Zhao Liang (Vimer), zhaoturkkey@163.com
#  Organization:  BrokenSun
#       Version:  1.0
#       Created:  2017/7/5 16:06:33
#      Revision:  ---
#       License:  Copyright (c) 2017, Ye Zhao Liang
#===============================================================================
# 
function b()
{
print "b.in.$1="$1;
}
{
v=100; y=200
print "a.in.v="v;
print "a.in.y="y;

a(y);
b();
print "a.out.v="v;
print "a.out.y="y;
}


function a(y)
{
print "(a)v="v;
v=v+$1+y;
y=300;
}
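
As far as I can tell, the script above is probing awk's scoping rules: a function parameter is a local copy, any other variable is global, and a function may be called before its definition appears. A condensed sketch of the same experiment, run inline with 400 as the input line (the wrapper name scope_demo is made up):

```bash
scope_demo() {
    echo 400 | awk '
    function a(y) {        # y is a local copy; v is the global v
        v = v + $1 + y     # 100 + 400 + 200
        y = 300            # changes only the local copy
    }
    {
        v = 100; y = 200
        a(y)
        print "v=" v, "y=" y
    }'
}

scope_demo   # prints: v=700 y=200
```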

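6. Four bash variable-trimming expansions

  1. ${var#.*} scans from the left and removes the shortest match of the pattern after #
  2. ${var##.*} scans from the left and removes the longest match of the pattern after ##
  3. ${var%.*} scans from the right and removes the shortest match of the pattern after %
  4. ${var%%.*} scans from the right and removes the longest match of the pattern after %%

In awk, substr($1,0,length($1)-..) can be used for a similar effect.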
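
The four forms are easier to remember with a concrete value. In this sketch the sample values are invented, and for the prefix forms the pattern is written as *. (anything up to a dot), since these patterns are shell globs rather than regular expressions.

```bash
f='archive.tar.gz'
p='./dir/ch1.pdf'

echo "${f#*.}"    # shortest match from the left  -> tar.gz
echo "${f##*.}"   # longest match from the left   -> gz
echo "${f%.*}"    # shortest match from the right -> archive.tar
echo "${f%%.*}"   # longest match from the right  -> archive
echo "${p#.}"     # drop only the leading dot     -> /dir/ch1.pdf

# awk counterpart of ${f%.*} when the suffix length is known (3 characters here)
echo "$f" | awk '{ print substr($0, 1, length($0) - 3) }'   # -> archive.tar
```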

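7. Containment tests in bash

Containment: a larger part contains a smaller part (member). Equivalence: two things are equal (equal). Comparison: usually between two numbers, although strings can be compared as well.

Ways to test containment in bash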

strA="helloworld"
strB="low"
if [[ $strA =~ $strB ]]
then
    echo "contains"
else
    echo "does not contain"
fi
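
One thing to keep in mind with the =~ test above: an unquoted right-hand side is treated as a regular expression, so characters such as . or * in strB are not matched literally. A sketch of a literal substring test using case, with the hypothetical helper name contains:

```bash
# Succeeds when $2 occurs literally inside $1.
contains() {
    case $1 in
        *"$2"*) return 0 ;;   # the quoted "$2" is matched character for character
        *)      return 1 ;;
    esac
}

strA="helloworld"
strB="low"
if contains "$strA" "$strB"; then
    echo "contains"
else
    echo "does not contain"
fi
```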

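8. Trimming leading and trailing whitespace in awk

Tip 5 covered how functions are defined; here they are put to use. On one occasion I found that every file name had picked up an extra space in its suffix, so I set out to strip the space, and decided to do it in awk.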

function ltrim(s) { sub(/^[ \t\r\n]+/, "", s); return s }
function rtrim(s) { sub(/[ \t\r\n]+$/, "", s); return s }
function trim(s) { return rtrim(ltrim(s)); }
BEGIN{
        FS=","
}

{
        $0 = rtrim($0);
        if($2!="-" && $3=="-")
                a[$4]++;
        {
        if($4!="-")
                b[$4]++;
        else
                b[$5]++;
        }
}

END{
        print "   client    incr_num_day";
        for(i in a) printf("%10s   %d\n",i,a[i])
        print "\n\n   client    all_num";
        for(j in b) printf("%10s   %d\n",j,b[j]);
}
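
The three trim helpers can also be tried on their own, separately from the counting logic above. A minimal sketch, where the wrapper name trim_lines and the sample input are made up:

```bash
trim_lines() {
    awk '
    function ltrim(s) { sub(/^[ \t\r\n]+/, "", s); return s }
    function rtrim(s) { sub(/[ \t\r\n]+$/, "", s); return s }
    function trim(s)  { return rtrim(ltrim(s)) }
    { print "[" trim($0) "]" }   # brackets make the remaining whitespace visible
    '
}

printf '  ch1.pdf \n' | trim_lines   # prints: [ch1.pdf]
```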

9. Calling system commands from awk

Method

I. Preparation:

touch c.txt
touch d.txt

II. a.txt:

c.txt
d.txt

III. code:

awk '{cmd="rm "$0;system(cmd)}' a.txt   
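
The command string above is handed to the shell as is, so a listed name containing spaces would be split into several arguments. A slightly more defensive sketch wraps each name in single quotes and reports failures; the function name remove_listed is made up, and names that themselves contain a single quote are assumed not to occur.

```bash
# Delete every file listed (one per line) in the list file given as $1.
remove_listed() {
    awk '{
        cmd = "rm -- \047" $0 "\047"   # \047 is the octal escape for a single quote
        if (system(cmd) != 0) print "failed: " $0 > "/dev/stderr"
    }' "$1"
}
```

With the files prepared as above, remove_listed a.txt would delete c.txt and d.txt.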

10. awk redirection and pipes

Sometimes a pipe can be used directly inside awk to hand output to a shell command, for example print | "sort":

awk '{print $1, $2 | "sort" }'
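
Two details are worth trying out here: closing the pipe so that the sorted output appears before anything printed afterwards, and redirecting print to a file. The function names sorted_pairs and first_field_to and the sample input are assumptions made for this sketch.

```bash
# Sort the first two fields, then print a marker once sort has finished.
sorted_pairs() {
    awk '{ print $1, $2 | "sort" }
         END { close("sort"); print "done" }'
}

# Write the first field of every line to the file named by $1.
first_field_to() {
    awk -v out="$1" '{ print $1 > out }'
}

printf 'b 2\na 1\n' | sorted_pairs
```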

11. Running scripts under Cygwin on Windows

Scripts must first be converted with dos2unix:

dos2unix.exe <script name>
dos2unix.exe a.sh
dos2unix.exe func.awk

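Only after this conversion does the shell run the scripts correctly, because dos2unix turns Windows CRLF line endings into the LF endings the shell expects.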
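
If dos2unix is not at hand, the same check and conversion can be approximated with standard tools. This is a rough sketch with made-up function names; note that tr removes every carriage return in the file, not only those at line ends.

```bash
# Succeeds when the file contains at least one carriage return.
has_crlf() {
    grep -q "$(printf '\r')" "$1"
}

# Remove carriage returns in place, keeping the file's permissions.
strip_crlf() {
    local tmp
    tmp=$(mktemp) || return 1
    tr -d '\r' < "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}
```

For example: has_crlf a.sh && strip_crlf a.sh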

12. awk is faster than the shell

Reference link

Performance comparison

[chengmo@localhost nginx]# time (awk 'BEGIN{ total=0;for(i=0;i<=10000;i++){total+=i;}print total;}')
50005000

real    0m0.003s
user    0m0.003s
sys     0m0.000s
[chengmo@localhost nginx]# time(total=0;for i in $(seq 10000);do total=$(($total+i));done;echo $total;)
50005000

real    0m0.141s
user    0m0.125s
sys     0m0.008s 

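Conclusion: arithmetic runs faster in awk than in bash; in the test above the awk loop took about 0.003 s, against about 0.141 s for the bash loop.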
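
To repeat the comparison on your own machine, the two loops can be wrapped in functions. The names sum_awk and sum_bash are made up here; the loop bodies follow the quoted test, and the timings will of course differ from system to system.

```bash
# Sum 0..10000 inside a single awk process.
sum_awk() {
    awk 'BEGIN { total = 0; for (i = 0; i <= 10000; i++) total += i; print total }'
}

# Sum 1..10000 with shell arithmetic, one expansion per iteration.
sum_bash() {
    local total=0
    local i
    for i in $(seq 10000); do
        total=$((total + i))
    done
    echo "$total"
}
```

In bash, run them as: time sum_awk; time sum_bash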