Commit

Merge pull request #3 from WeBankFinTech/dev-0.9.0

Dev 0.9.0

yuchenyao authored Jul 1, 2020
2 parents fa97771 + 0466aca commit 9051b34
Showing 31 changed files with 603 additions and 8 deletions.
82 changes: 82 additions & 0 deletions docs/zh_CN/ch3/DSS_User_Tests1_Scala.md
@@ -0,0 +1,82 @@
# DSS User Test Case 1: Scala

The DSS user test cases give new platform users a set of examples for getting familiar with common DSS operations and for verifying that the DSS platform behaves correctly.

![image-20200408211243941](../../../images/zh_CN/chapter3/tests/home.png)

## 1.1 Spark Core (entry point sc)

In Scriptis, a SparkContext has already been registered for you, so you can use `sc` directly:

### 1.1.1 Single-value operator (map as an example)

```scala
val rddMap = sc.makeRDD(Array((1,"a"),(1,"d"),(2,"b"),(3,"c")),4)
val res = rddMap.mapValues(data=>{data+"||||"})
res.collect().foreach(data=>println(data._1+","+data._2))
```
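The same `mapValues` semantics can be checked with plain Scala collections, with no Spark session required; this is only an illustrative sketch, not DSS code:

```scala
// Plain-Scala sketch of the mapValues semantics above: apply a function
// to every value while leaving the keys untouched.
val pairs = List((1, "a"), (1, "d"), (2, "b"), (3, "c"))
val suffixed = pairs.map { case (k, v) => (k, v + "||||") }
suffixed.foreach { case (k, v) => println(k + "," + v) }
```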

### 1.1.2 Two-value operator (union as an example)

```scala
val rdd1 = sc.makeRDD(1 to 5)
val rdd2 = sc.makeRDD(6 to 10)
val rddCustom = rdd1.union(rdd2)
rddCustom.collect().foreach(println)
```
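As a local sanity check (illustrative only, no SparkContext): RDD `union` simply concatenates the two datasets and keeps duplicates, which `++` on plain Scala collections mirrors exactly:

```scala
// union concatenates the two datasets without de-duplicating,
// matching the RDD union contract.
val left  = (1 to 5).toList
val right = (6 to 10).toList
val unioned = left ++ right
unioned.foreach(println)
```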

### 1.1.3 Key-value operator (reduceByKey as an example)

```scala
val rdd1 = sc.makeRDD(List(("female",1),("male",2),("female",3),("male",4)))
val rdd2 = rdd1.reduceByKey((x,y)=>x+y)
rdd2.collect().foreach(println)
```
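What `reduceByKey` computes can be reproduced over a plain List with `groupBy` plus a fold, again as a standalone sketch rather than DSS code:

```scala
// reduceByKey groups pairs by key and folds the values with the given
// function; groupBy + reduce over a plain List yields the same result.
val records = List(("female", 1), ("male", 2), ("female", 3), ("male", 4))
val reduced = records
  .groupBy(_._1)
  .map { case (k, vs) => (k, vs.map(_._2).reduce(_ + _)) }
println(reduced)
```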

### 1.1.4 Action operators (collect, as used above)

### 1.1.5 Reading a file from HDFS and doing simple processing

```scala
import spark.implicits._   // needed for toDF() and the $"..." column syntax

case class Person(name: String, age: String)
val file = sc.textFile("/test.txt")
val person = file.map(line => {
  val values = line.split(",")
  Person(values(0), values(1))
})
val df = person.toDF()
df.select($"name").show()
```
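The parsing step itself can be tried without HDFS by standing in for the file with an in-memory list; the sample lines below are made up for illustration:

```scala
// Same split-and-build logic as above, but the "file" is a List of
// strings instead of sc.textFile("/test.txt").
case class Person(name: String, age: String)

val lines = List("alice,30", "bob,25")  // hypothetical file contents
val people = lines.map { line =>
  val values = line.split(",")
  Person(values(0), values(1))
}
people.map(_.name).foreach(println)
```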



## 1.2 UDF Function Test

### 1.2.1 Defining the function

```scala
def ScalaUDF3(str: String): String = "hello, " + str + "this is a third attempt"
```
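Before registering the function in Scriptis, its body can be sanity-checked locally as ordinary Scala (registration itself happens through the DSS UI, not this snippet):

```scala
// Local check of the UDF body; no Spark session involved.
def ScalaUDF3(str: String): String = "hello, " + str + "this is a third attempt"

val greeting = ScalaUDF3("DSS")
println(greeting)
```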

### 1.2.2 Registering the function

Functions → Personal Functions → right-click → New Spark Function. Registration then works the same way as in regular Spark development.

![img](../../../images/zh_CN/chapter3/tests/udf1.png)

## 1.3 UDAF Function Test

### 1.3.1 Uploading the jar

Develop an average-computing UDAF in IDEA, package it into a jar (wordcount), and upload it to the DSS jar folder.

![img](../../../images/zh_CN/chapter3/tests/udf2.png)
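The averaging logic such a UDAF implements boils down to a running (sum, count) buffer that is merged and finalized. The sketch below illustrates only that logic; the names are invented and it is not the actual jar's code:

```scala
// Minimal sketch of an averaging aggregation buffer: accumulate
// (sum, count), merge partial buffers, divide at the end.
final case class AvgBuffer(sum: Double, count: Long) {
  def add(x: Double): AvgBuffer      = AvgBuffer(sum + x, count + 1)
  def merge(o: AvgBuffer): AvgBuffer = AvgBuffer(sum + o.sum, count + o.count)
  def result: Double                 = if (count == 0) 0.0 else sum / count
}

val buf = List(1.0, 2.0, 3.0, 4.0).foldLeft(AvgBuffer(0.0, 0L))(_ add _)
println(buf.result)
```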

### 1.3.2 Registering the function

Functions → Personal Functions → right-click → New Common Function. Registration then works the same way as in regular Spark development.

![img](../../../images/zh_CN/chapter3/tests/udf-3.png)
148 changes: 148 additions & 0 deletions docs/zh_CN/ch3/DSS_User_Tests2_Hive.md
@@ -0,0 +1,148 @@
# DSS User Test Case 2: Hive

The DSS user test cases give new platform users a set of examples for getting familiar with common DSS operations and for verifying that the DSS platform behaves correctly.

![image-20200408211243941](../../../images/zh_CN/chapter3/tests/home.png)

## 2.1 Creating Warehouse Tables

Go to the "Database" page, click "+", and enter the table information, schema, and partition settings in turn to create a table:

<img src="../../../images/zh_CN/chapter3/tests/hive1.png" alt="image-20200408212604929" style="zoom:50%;" />

![img](../../../images/zh_CN/chapter3/tests/hive2.png)

Following this flow, create the department table dept, the employee table emp, and the partitioned employee table emp_partition. The DDL statements are:

```sql
create external table if not exists default.dept(
deptno int,
dname string,
loc int
)
row format delimited fields terminated by '\t';

create external table if not exists default.emp(
empno int,
ename string,
job string,
mgr int,
hiredate string,
sal double,
comm double,
deptno int
)
row format delimited fields terminated by '\t';

create table if not exists emp_partition(
empno int,
ename string,
job string,
mgr int,
hiredate string,
sal double,
comm double,
deptno int
)
partitioned by (month string)
row format delimited fields terminated by '\t';
```

**Loading data**

Bulk data currently has to be loaded manually from the backend; individual rows can also be inserted from the page with INSERT statements. Note that loading into the partitioned table requires a `partition` clause:

```sql
load data local inpath 'dept.txt' into table default.dept;
load data local inpath 'emp.txt' into table default.emp;
load data local inpath 'emp1.txt' into table default.emp_partition partition(month='202001');
load data local inpath 'emp2.txt' into table default.emp_partition partition(month='202002');
load data local inpath 'emp3.txt' into table default.emp_partition partition(month='202003');
```

Load the remaining data with similar statements. The sample data files are under `examples\ch3`.

## 2.2 Basic SQL Syntax Tests

### 2.2.1 Simple query

```sql
select * from dept;
```

### 2.2.2 Joins

```sql
select * from emp
left join dept
on emp.deptno = dept.deptno;
```

### 2.2.3 Aggregate functions

```sql
select dept.dname, avg(sal) as avg_salary
from emp left join dept
on emp.deptno = dept.deptno
group by dept.dname;
```
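What this GROUP BY aggregation computes can be sketched over in-memory rows in plain Scala; the (department, salary) values below are illustrative sample data, not the contents of the emp table:

```scala
// Average salary per department: group rows by department name, then
// divide the salary sum by the row count, mirroring avg(sal) ... group by.
val rows = List(("RESEARCH", 800.0), ("RESEARCH", 2975.0), ("SALES", 1250.0))
val avgByDept = rows
  .groupBy(_._1)
  .map { case (dept, rs) => (dept, rs.map(_._2).sum / rs.size) }
println(avgByDept)
```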

### 2.2.4 Built-in functions

```sql
select ename, job, sal,
rank() over(partition by job order by sal desc) as sal_rank
from emp;
```

### 2.2.5 Simple partition-table query

```sql
show partitions emp_partition;
select * from emp_partition where month='202001';
```

### 2.2.6 Partition-table union query

```sql
select * from emp_partition where month='202001'
union
select * from emp_partition where month='202002'
union
select * from emp_partition where month='202003';
```
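One subtlety worth knowing here: SQL `UNION` removes duplicate rows across the three partitions, whereas `UNION ALL` would keep them. A plain-Scala sketch of the difference, with made-up row values:

```scala
// UNION ALL keeps duplicates (plain concatenation); UNION de-duplicates,
// which distinct on the concatenated list mirrors.
val jan = List("emp1", "emp2")
val feb = List("emp2", "emp3")
val unionAll = jan ++ feb             // UNION ALL semantics
val union    = (jan ++ feb).distinct  // UNION semantics
println(union)
```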

## 2.3 UDF Function Test

### 2.3.1 Uploading the jar

On the Scriptis page, right-click a directory in the tree to upload the jar:

![img](../../../images/zh_CN/chapter3/tests/hive3.png)

The sample jar is at `examples\ch3\rename.jar`.

### 2.3.2 Custom functions

Open the "UDF Functions" panel (marker 1 in the screenshot), right-click the "Personal Functions" directory, and choose "New Function":

<img src="../../../images/zh_CN/chapter3/tests/hive4.png" alt="image-20200408214033801" style="zoom: 50%;" />

Enter the function name, select the jar, and fill in the registration format and the input/output formats to create the function:

![img](../../../images/zh_CN/chapter3/tests/hive5.png)

<img src="../../../images/zh_CN/chapter3/tests/hive-6.png" alt="image-20200409155418424" style="zoom: 67%;" />

The registered function appears as follows:

![img](../../../images/zh_CN/chapter3/tests/hive7.png)

### 2.3.3 Using the custom function in a SQL query

Once the function is registered, create a .hql file in the workspace to call it:

```sql
select deptno,ename, rename(ename) as new_name
from emp;
```
61 changes: 61 additions & 0 deletions docs/zh_CN/ch3/DSS_User_Tests3_SparkSQL.md
@@ -0,0 +1,61 @@
# DSS User Test Case 3: SparkSQL

The DSS user test cases give new platform users a set of examples for getting familiar with common DSS operations and for verifying that the DSS platform behaves correctly.

![image-20200408211243941](../../../images/zh_CN/chapter3/tests/home.png)

## 3.1 RDD and DataFrame Conversion

### 3.1.1 RDD to DataFrame

```scala
case class MyList(id:Int)

val lis = List(1,2,3,4)

val listRdd = sc.makeRDD(lis)
import spark.implicits._
val df = listRdd.map(value => MyList(value)).toDF()

df.show()
```

### 3.1.2 DataFrame to RDD

```scala
case class MyList(id:Int)

val lis = List(1,2,3,4)
val listRdd = sc.makeRDD(lis)
import spark.implicits._
val df = listRdd.map(value => MyList(value)).toDF()
println("------------------")

val dfToRdd = df.rdd

dfToRdd.collect().foreach(print(_))
```

## 3.2 DSL-style API

```scala
// df1 and df2 are assumed to be two existing DataFrames that share a
// schema containing a "department" column (e.g. built earlier with toDF()).
val df = df1.union(df2)
val dfSelect = df.select($"department")
dfSelect.show()
```

## 3.3 SQL-style API (entry point sqlContext)

```scala
// df1 and df2 are assumed to be two existing DataFrames that share a
// schema containing a "department" column (e.g. built earlier with toDF()).
val df = df1.union(df2)

df.createOrReplaceTempView("dfTable")
val innerSql = """
SELECT department
FROM dfTable
"""
val sqlDF = sqlContext.sql(innerSql)
sqlDF.show()
```

@@ -2,6 +2,11 @@

import com.webank.wedatasphere.dss.server.dto.response.*;
import com.webank.wedatasphere.dss.server.entity.*;
import com.webank.wedatasphere.dss.server.dto.response.HomepageDemoInstanceVo;
import com.webank.wedatasphere.dss.server.dto.response.HomepageDemoMenuVo;
import com.webank.wedatasphere.dss.server.dto.response.HomepageVideoVo;
import com.webank.wedatasphere.dss.server.dto.response.WorkspaceFavoriteVo;
import org.apache.ibatis.annotations.Param;

import java.util.List;

@@ -36,4 +41,11 @@ public interface WorkspaceMapper {
List<OnestopMenuAppInstanceVo> getMenuAppInstancesCn(Long id);
List<OnestopMenuAppInstanceVo> getMenuAppInstanceEn(Long id);

List<WorkspaceFavoriteVo> getWorkspaceFavoritesCn(@Param("username") String username, @Param("workspaceId") Long workspaceId);

List<WorkspaceFavoriteVo> getWorkspaceFavoritesEn(@Param("username") String username, @Param("workspaceId") Long workspaceId);

void addFavorite(DWSFavorite dwsFavorite);

void deleteFavorite(Long favouritesId);
}