CLI for Indexing data from MSSQL to Elasticsearch

ElasticSearch is fantastic for indexing and filtering data. But hey, you have your data in a MS SQL Server database. How do you copy all data from the database to Elastic? And that too making sure that nothing goes wrong.

Don’t worry, this is indeed possible. We at appbase.io have made an awesome CLI tool called ABC which will allow you to do this with a single command.

abc import --src_type=mssql --src_uri= 

THAT’s it. Seriously, this is all you need to sync a MS SQL database to an ElasticSearch index. Here’s a video showing the process.

The Steps

The first step is to install ABC if you have not done so already. So go to the GitHub releases page for ABC and download the most recent version. It’s a single no-dependancy binary so put it anywhere you like. We recommended putting it inside a PATH directory so that it can be access from anywhere in the terminal.

Ensure that ABC is working by running the following command.

abc version

abc version output

Now, let’s take a MS SQL database and we are going to sync it to ElasticSearch index hosted on Appbase.io.

First, we are going to create a database called ‘abc-testing’. Then we are going to create a table called ‘users’ with the following schema.

CREATE TABLE users (
    email nvarchar(500) NOT NULL,
    name nvarchar(100) NOT NULL,
    bio text,
    PRIMARY KEY(email)
);

After that we will add some sample data to it.

insert into users(email, name, bio) values('[email protected]', 'Avi', 'Dev');
insert into users(email, name, bio) values('[email protected]', 'john', 'Cook');

The table looks like as follows now –

users table data

The MS SQL test source is now complete. It’s URL looks as follows

sqlserver://USER:[email protected]:PORT?database=DBNAME

Next we are going to create the sink ElasticSearch index. We go to appbase.io and create a new app called abcmssqltest . The complete URL to this index looks like the following.

https://USER:[email protected]/abcmssqltest

So now we have both the source and the sink ready. It’s time for some ABC magic. Using this source and sink, we can build the import command. It will be as follows

abc import --src_type=mssql --src_uri="sqlserver://USER:[email protected]:PORT?database=DBNAME" "https://USER:[email protected]/abcmssqltest"

Once you run this command, you should see that the command will finish in some time with no errors. Now if you visit appbaseio dashboard, you can see that the data has been transferred to the target ElasticSearch index.

appbaseio data view

Voila. Everything works. The data has been transferred to ElasticSearch and that too without doing anything at all. Next we will see how to transform and manipulate data as it goes from source to the sink.

Transforming Data before Indexing into Elasticsearch

There are times when you don’t need the data to go as it is from source to the sink. You might like to change the target type name (example — users to accounts) or you might like to remove certain fields (e.g. bio ) or create new fields. For all this, we have the transforms feature in ABC. It can be used by the transform_file parameter.

abc import --src_type=mssql --src_uri="sqlserver://USER:[email protected]:PORT?database=DBNAME" --transform_file="transform_file.js" "https://USER:[email protected]/abcmssqltest"

The transform_file parameter takes the path to the transform file. That file in turn contains the JavaScript transforms that should be applied to the pipeline. Let’s take the contents of transform_file.js as follows.

t.Source("source", source, "/.*/")
 .Transform(omit({"fields":["bio"]}))
 .Save("sink", sink, "/.*/")

In the above transform, you can see that we are going to omit the bio field from the data transfer. Now when we run the new import command, we should have the following result.

sink data after applying transform

As you can see, bio field was omitted when data reached the sink. More documentation on transform file can be found on GitHub. It supports lots of inbuilt functions like omit and even supports running custom JavaScript code as a transform. It’s a good idea to explore its documentation.

Summary

This article is part of the article series “How to index data from into ElasticSearch using ABC”. If you found this interesting, consider giving this article a :clap|type_1_2: so that others readers can find it on Medium too. ��

Further Reading

ABC’s README is a good place to start if you want to learn more about the application. You can also have a look at the MSSQL adaptor docs . Furthermore, you may :star:️ the repo on GitHub and watch it to stay tuned for updates.

稿源:All things #datastreams (源链) | 关于 | 阅读提示

本站遵循[CC BY-NC-SA 4.0]。如您有版权、意见投诉等问题,请通过eMail联系我们处理。
酷辣虫 » 后端存储 » CLI for Indexing data from MSSQL to Elasticsearch

喜欢 (0)or分享给?

专业 x 专注 x 聚合 x 分享 CC BY-NC-SA 4.0

使用声明 | 英豪名录