Key Points

...

However, if Spark runs on YARN alongside other shared services, performance might degrade, with RAM overhead and memory leaks as side effects. For this reason, for batch-processing use cases, Hadoop has been found to be the more efficient system.

Improve RAJG data virtualization layers with data consumption methods

https://www.linkedin.com/posts/rajkgrover_data-datamanagement-banking-activity-7221140607228911617-hgeR?utm_source=share&utm_medium=member_desktop

add consumption models from sources, lakes

batch, transaction request, events, streams

add column for MDM, governance

Data Services Methods

https://www.linkedin.com/posts/giorgiotorre1234_how-many-api-architecture-styles-do-you-activity-7059072388340064256-TM9U?utm_source=share&utm_medium=member_desktop

...

https://developer.mozilla.org/en-US/docs/Web/API/WebSockets_API/Writing_a_WebSocket_server_in_Java

Writing WebSocket servers: https://developer.mozilla.org/en-US/docs/Web/API/WebSockets_API/Writing_WebSocket_servers

https://github.com/mdn/content/blob/main/files/en-us/web/api/websockets_api/writing_a_websocket_server_in_java/index.md?plain=1

...

disributed-db-sharding-strategies1.pdf

This article looks at four data sharding strategies for distributed SQL: algorithmic, range, linear hash, and consistent hash.

Data sharding helps in scalability and geo-distribution by horizontally partitioning data. A SQL table is decomposed into multiple sets of rows according to a specific sharding strategy. Each of these sets of rows is called a shard. These shards are distributed across multiple server nodes (containers, VMs, bare-metal) in a shared-nothing architecture. This ensures that the shards do not get bottlenecked by the compute, storage, and networking resources available at a single node. High availability is achieved by replicating each shard across multiple nodes. However, the application interacts with a SQL table as one logical unit and remains agnostic to the physical placement of the shards. In this section, we will outline the pros, cons, and our practical learnings from the sharding strategies adopted by these databases.
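To make the row-to-shard mapping concrete, here is a minimal Python sketch of three of the strategies named above (algorithmic, range, and consistent hash). It is an illustration under stated assumptions, not the implementation used by any particular database: the function and class names, the split points, the node names, and the use of MD5 as a stable hash are all hypothetical choices made for this example. Linear hash sharding is not shown.

```python
from __future__ import annotations

import bisect
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic 64-bit hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


def algorithmic_shard(key: str, num_shards: int) -> int:
    """Algorithmic sharding: shard = hash(key) mod N."""
    return stable_hash(key) % num_shards


def range_shard(key: str, split_points: list[str]) -> int:
    """Range sharding: sorted split points divide the key space into
    len(split_points) + 1 contiguous shards; a key equal to a split point
    belongs to the shard on its right."""
    return bisect.bisect_right(split_points, key)


class ConsistentHashRing:
    """Consistent hash sharding: each node owns several points on a ring,
    and a key belongs to the first node point after the key's hash."""

    def __init__(self, nodes: list[str], vnodes: int = 64) -> None:
        # vnodes (virtual nodes per physical node) is an assumed tuning knob.
        self._ring = sorted(
            (stable_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        # Wrap around to the first point when the hash is past the last one.
        idx = bisect.bisect_right(self._points, stable_hash(key)) % len(self._ring)
        return self._ring[idx][1]


def moved_keys_modulo(keys: list[str], old_n: int, new_n: int) -> int:
    """Count keys whose shard changes when the shard count changes."""
    return sum(algorithmic_shard(k, old_n) != algorithmic_shard(k, new_n) for k in keys)


def moved_keys_consistent(keys: list[str], old_nodes: list[str], new_nodes: list[str]) -> int:
    """Count keys whose owning node changes when the node list changes."""
    old_ring, new_ring = ConsistentHashRing(old_nodes), ConsistentHashRing(new_nodes)
    return sum(old_ring.node_for(k) != new_ring.node_for(k) for k in keys)


if __name__ == "__main__":
    keys = [f"user-{i}" for i in range(1000)]
    print("range shard for 'horse':", range_shard("horse", ["g", "n", "t"]))
    print("moved (modulo, 3 -> 4 shards):", moved_keys_modulo(keys, 3, 4))
    print(
        "moved (consistent, 3 -> 4 nodes):",
        moved_keys_consistent(keys, ["a", "b", "c"], ["a", "b", "c", "d"]),
    )
```

The comparison at the end hints at the main trade-off: with modulo-based sharding, changing the shard count remaps most keys, whereas on a consistent hash ring the only keys that move are those taken over by the newly added node. Range sharding keeps adjacent keys together, which suits range scans but depends on choosing split points that balance the load.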


Data Driven Organization Maturity Levels

...