This repository was archived by the owner on Apr 6, 2021. It is now read-only.

[api] method POST is used when synchronization of data should be done by PUT with explicitly indicated ID. #114

@gustawdaniel


I have a question about

https://crickapi.docs.apiary.io/#reference/crick-api-for-watson/frames/push-(not-yet-synchronized)-user's-frames

POST is proposed for synchronization, but POST is dedicated to creating resources.
PUT with an explicitly indicated ID should be used for both creating and updating resources.
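A minimal sketch of the difference, assuming an in-memory stand-in for the server. `FrameStore`, its `post` and `put` methods, and the sample ID are hypothetical and only illustrate general HTTP semantics, not the actual Crick implementation:

```python
import uuid


class FrameStore:
    """Hypothetical in-memory stand-in for the server's frame collection."""

    def __init__(self):
        self.frames = {}

    def post(self, frame):
        # POST semantics: the server assigns a fresh ID on every call,
        # so repeating the same request creates a second resource.
        frame_id = uuid.uuid4().hex
        self.frames[frame_id] = dict(frame)
        return frame_id

    def put(self, frame_id, frame):
        # PUT semantics: the client names the ID. The frame is created if
        # absent and replaced if present, so a retry changes nothing.
        created = frame_id not in self.frames
        self.frames[frame_id] = dict(frame)
        return created


store = FrameStore()
frame = {"project": "crick", "start": "2019-01-26T10:00:00", "stop": "2019-01-26T11:00:00"}

store.put("a1b2c3", frame)
store.put("a1b2c3", frame)  # retried request, still a single frame
put_count = len(store.frames)

store.post(frame)
store.post(frame)  # retried request, now a duplicate
post_count = len(store.frames) - put_count
```

With PUT a repeated synchronization leaves one frame on the server; with POST as modeled here, each repetition adds another copy.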
Full synchronization also requires answers to the following questions:

Let A be a client (for example, Watson CLI)
Let B be a server (for example, the app.crick.io API)

Question 1. Which participant in the synchronization holds the source of truth?
a) both are equal
b) client
c) server

If a - both are equal
Question 2. Which behavior is correct when two resources have the same id but different values? The data model of a time frame does not contain a last modification time. Even if it did, correct synchronization would also need the background: the previous common version. This is an open question, related to the lack of documentation about synchronization.
Question 3. Should data deletion be allowed? If A has a resource but B does not, does synchronization mean that the resource should be added to B, or deleted from A? If we choose the adding strategy, how do we remove a resource? If we choose the deleting strategy, how do we add one?
These are not all the questions, but I do not have infinite time, so let's go to the next possibility.
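To show why Question 2 needs the previous common version, here is a rough sketch of a field-by-field three-way merge. The function name `merge_frame`, the field names, and the assumption that all three versions share the same fields are mine, not part of Watson or Crick:

```python
def merge_frame(base, local, remote):
    """Three-way merge of one frame, field by field.

    Assumes base, local and remote all contain the same fields.
    Returns (merged, conflicts), where conflicts lists the fields
    that both sides changed to different values.
    """
    merged, conflicts = {}, []
    for key in base:
        b, l, r = base[key], local[key], remote[key]
        if l == r:
            merged[key] = l      # both agree (changed identically or not at all)
        elif l == b:
            merged[key] = r      # only the remote side changed this field
        elif r == b:
            merged[key] = l      # only the local side changed this field
        else:
            conflicts.append(key)
            merged[key] = b      # keep the common version until someone decides
    return merged, conflicts


base = {"project": "crick", "stop": "11:00"}
local = {"project": "crick-api", "stop": "11:00"}   # client renamed the project
remote = {"project": "crick", "stop": "11:30"}      # server moved the stop time

merged, conflicts = merge_frame(base, local, remote)

# Without the base there is no way to tell which side changed "project".
both_changed = merge_frame(base, local, {"project": "watson", "stop": "11:00"})
```

The two versions alone only show that the values differ; the common version is what tells which side made the change.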

If b - the client (Watson CLI) is master
Question 4. Then it should start synchronization by getting data from the server, comparing it with the data in Watson, and then sending POST only for the frames that do not exist on the server yet (not synchronized yet).
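The comparison step could look roughly like this, assuming frames are dictionaries with an `id` key and the client is the master. `plan_sync` and the sample frames are hypothetical; only the "create" part maps to the GET and POST endpoints mentioned here:

```python
def plan_sync(local_frames, server_frames):
    """Client-as-master plan: what must change on the server to match the client."""
    local = {f["id"]: f for f in local_frames}
    server = {f["id"]: f for f in server_frames}
    return {
        # On the client only: these are the frames to POST.
        "create": [local[i] for i in sorted(local.keys() - server.keys())],
        # On both sides but different: would need PUT / PATCH.
        "update": [local[i] for i in sorted(local.keys() & server.keys())
                   if local[i] != server[i]],
        # On the server only: would need DELETE if the client is the master.
        "delete": sorted(server.keys() - local.keys()),
    }


local_frames = [
    {"id": "a", "project": "crick"},
    {"id": "b", "project": "watson-cli"},
]
server_frames = [
    {"id": "b", "project": "watson"},
    {"id": "c", "project": "old"},
]

plan = plan_sync(local_frames, server_frames)
```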

This is the source of my question, because we have endpoints both to GET all frames and to POST the missing frames, but this is an incomplete approach. What about updating frames that changed (PUT / PATCH) and removing frames that were deleted (DELETE)? And how does putting the synchronization logic into Watson CLI relate to the recommendation of @SpotlightKid from

jazzband/Watson#40

who wrote in 2015:

To not bloat the Watson distribution with too many sync backends (and their dependencies), I propose to use a plugin framework to load backend implementations and to specify the API that they have to support.

Question 5. What is the scenario when two clients are connected to one backend, one with data and the second without? Should synchronization with the first create data on the server, and synchronization with the second remove it? Taking into account that only GET and POST are implemented, I suspect that the first synchronization creates data on the server and the second copies it to the second client. But when I remove a frame from the first client and synchronize again, the frame will reappear on that client rather than be removed from the server. Should this be considered a bug?
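The suspected behavior can be reproduced with a small simulation. It assumes a sync that only POSTs frames the server lacks and GETs frames the client lacks; the `sync` function and the dictionaries are a simplified model, not the real Watson or Crick code:

```python
def sync(client, server):
    """Sync with GET and POST only: nothing is ever updated or deleted."""
    # POST: send the frames the server does not have yet.
    for frame_id, frame in client.items():
        server.setdefault(frame_id, frame)
    # GET: fetch all frames and keep the ones the client does not have yet.
    for frame_id, frame in server.items():
        client.setdefault(frame_id, frame)


server = {}
first = {"f1": {"project": "crick"}}   # client with data
second = {}                            # client without data

sync(first, server)    # f1 is created on the server
sync(second, server)   # f1 is copied to the second client

del first["f1"]        # the user deletes the frame on the first client
sync(first, server)    # the server still has f1, so it comes back

resurrected = "f1" in first
```

In this model a deletion on one client can never reach the server or the other client, because the sync cannot tell "deleted here" from "not downloaded yet".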

It is actually already reported as "Watson deleted frames do not sync with crick":

#111

If c - the server is master and the CLI is slave
This is rather improbable, because synchronization would then mean that you can create data only on the server. But when I saw the issue

jazzband/Watson#171

I decided to add the next question.
Question 6. Who has the deciding voice on this topic? @jmaupetit wrote:

We must re-consider our synchronization strategy which —at the time of writing— overrides local changes between two sync events.

It is related to my question about integration with external data sources that use their own identifiers:

jazzband/Watson#190

It is also related to the unfinished discussion about the synchronization logic there:

Syncing with server overrides local changes #171

And to the lack of documentation there:

jazzband/Watson#165

I can send my proposals. What should I do?

  1. Do research about synchronization protocols [today]
  2. Propose protocol [today]
  3. Wait for answers to the questions above [1 month]
  4. Wrap everything together and publish a draft of the specification of synchronization [1 week]
  5. Wait for fixes and opinions from community [1 month]
  6. Learn Go + React; I know C, C++, Python and Vue, so it will be easy [1 month]
  7. Implement this specification [1 month]
  8. Wait for accepting pull request [1 month]

If everything goes well, we will have working synchronization by mid-2019, and many issues connected with it will be closed.

So let's start.

  1. Research on synchronization:

https://en.wikipedia.org/wiki/Data_synchronization

We have

  • file synchronization
  • version control
  • distributed filesystems
  • mirroring

I propose version control.

The set reconciliation problem can be solved by

  • Wholesale transfer
  • Timestamp synchronization
  • Mathematical synchronization

I propose mathematical synchronization.

In the Error handling paragraph there is a sentence:

The simplest approach is to have a single master instance that is the sole source of truth.

But I propose another approach: accept any modification and store a list of modifications. When two modifications overlap, merge them with the "mathematical synchronization" that I will describe later.
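A rough sketch of what such a list of modifications could look like. Everything here is an assumption for illustration: the `Modification` fields, the `seen` counter used to decide whether two modifications overlap, and the function names. The merge itself is left out, because it is not specified yet:

```python
from dataclasses import dataclass


@dataclass
class Modification:
    seq: int        # position in the log, starting at 1
    client: str     # who made the change
    frame_id: str   # which frame was changed
    changes: dict   # field -> new value
    seen: int       # highest seq the client knew about when it made the change


def replay(log):
    """Rebuild the current frames by applying every modification in order."""
    frames = {}
    for mod in log:
        frames.setdefault(mod.frame_id, {}).update(mod.changes)
    return frames


def overlaps(log):
    """Pairs (earlier.seq, later.seq) that touch the same field of the same
    frame from different clients, where the later one had not seen the earlier."""
    pairs = []
    for i, earlier in enumerate(log):
        for later in log[i + 1:]:
            if (later.frame_id == earlier.frame_id
                    and later.client != earlier.client
                    and later.seen < earlier.seq
                    and set(later.changes) & set(earlier.changes)):
                pairs.append((earlier.seq, later.seq))
    return pairs


log = [
    Modification(1, "laptop", "f1", {"project": "crick"}, seen=0),
    Modification(2, "desktop", "f1", {"project": "crick-api"}, seen=0),
    Modification(3, "laptop", "f1", {"stop": "11:30"}, seen=2),
]

state = replay(log)
conflicting = overlaps(log)
```

Here modifications 1 and 2 overlap, because the desktop changed `project` without having seen the laptop's change; these are the pairs a merge step would have to resolve. Plain replay simply lets the later one win.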

Proposed tools

http://thesecretlivesofdata.com/raft/

There is a PDF

raft.pdf

and finally a list of implementations

https://raft.github.io/#implementations

So pros:

  • it has many implementations and is widely known
  • it works in a distributed network of nodes

Question:
Should we consider Watson CLI a Raft node or a client?
Answer:
It could be a node only if it had a public address, so that requests could be sent to it, but this is hard to achieve.
So Watson CLI should be a client in this model.

Cons:

  • it seems to be overengineered
  • it needs a cluster of servers to work efficiently
  • we are rather looking for a simple solution like "storage everywhere", "server -> serverless"

I researched some solutions and finally ended up on Stack Overflow, asking this question:

https://stackoverflow.com/questions/54385016/simple-synchronization-protocol-for-array-of-objects

This is also a first draft of my proposal for how to solve the synchronization problem. In this model, a serverless lambda + a text file stored anywhere can be replaced by the crick backend and Postgres, but the vision of serverless functions (which are free today for a small number of requests) and static file storage (which is also free for personal users) is more attractive to me than a backend that must be hosted.
