GPT fine tuning - Study notes - 아무튼-작업일지

Automation tools - to study

https://www.youtube.com/watch?v=ywH7JIK34Tg

yeji Kim

2024/09/29 3:25 PM

Building a landing page

https://www.youtube.com/watch?v=jnRqq2JEdxI https://www.youtube.com/watch?v=_6k1k7NtZRI How to use Framer AI prompts

yeji Kim

2024/09/29 2:17 PM

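Pattern learning

Pattern learning!! Template matching ... Things I'm curious about: pattern matching ... ah, this is really a simple thing. Hmm ... can't I get an LLM to do this? In the end I'd have to build an LLM ... no, build the hardware ... how do I put this smart friend to work in a real job ...! Give it the OCR result and the grammar-corrected result, then ask the LLM to show them to the user, split them up, and fix them. Oh, could this feedback loop be automated? That's nice, really nice. I want to at least make the feedback go back and forth like this!! So which functions would I need for that?!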

yeji Kim

2024/09/23 4:56 PM

Large Language Models for Information Retrieval: A Survey

Introduction: retrieval upstream - query reformulation; downstream - reranking and reading. Reranking runs only on a limited set of relevant documents. Personalization, diversification.
Background: information retrieval; relevance estimation - lexical similarity between the query and document vectors.
Components: query rewriter; retriever; reranker - fine-grained reordering; reader - comprehends real-time user intent and generates dynamic responses; search agent.
LLMs
Query rewriter: rewriting scenario

yeji Kim

2024/09/20 5:46 PM

Web 3.0

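Web 1.0, 2.0, 3.0: 1.0 - read; 2.0 - read + write; 3.0 - read + write + personalization (ownership).
Decentralization: platforms disappear.
Semantic web: computers read, understand, and process information on people's behalf and generate new information. Markup - XML, RDF, etc. Ontology.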

yeji Kim

2024/09/19 7:55 PM

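ELK stack (Elasticsearch, Logstash, Kibana)

Elasticsearch - data storage and search engine. Inverted index (key - word, value - doc).
Features: scale out - scales horizontally through shards; high availability - replicas keep the data safe; schema free - searches run over JSON documents, so there is no schema concept; RESTful - data CRUD operations go through the HTTP RESTful API, with each CRUD operation mapped to a corresponding HTTP method.
ES search: query context - computes relevance to find the most similar data (BM25); filter context.
Logstash - data collection. Logs are semi-structured data, so after collection the log format has to be parsed and cleaned up.
Features: plugin based; handles all types of data; performance - uses its own built-in in-memory and file-based queues; reliability - dead letter queue.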
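The inverted index noted above (key - word, value - doc) can be sketched in a few lines. This is a toy illustration, not how Elasticsearch implements it: the whitespace tokenizer, the AND-only `search` helper, and the sample documents are all assumptions made for the example.

```python
from collections import defaultdict


def build_inverted_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map each lowercase token to the set of document IDs that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the IDs of documents containing every query token (AND search)."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical log-like documents, just for illustration.
docs = {
    "doc1": "error connecting to database",
    "doc2": "database backup finished",
    "doc3": "user login error",
}
index = build_inverted_index(docs)
print(sorted(search(index, "database error")))  # ['doc1']
```

Looking up a word is a single dictionary access regardless of how many documents exist, which is why the structure suits full-text search.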

yeji Kim

2024/09/19 7:49 PM

The organization of information

Organization of recorded information
The nature of information: usefulness - data < information < knowledge < understanding < wisdom
Organization of information in different contexts: libraries
Descriptive cataloging: creating a description; choosing access points; ensuring authority control
Subject cataloging: conceptual analysis - aboutness; translation - aboutness → controlled subject language; choosing controlled vocabulary terms; choosing classification notations
Retrieval tools: the basic retrieval tools, their formats, and their functions
Bibliographies - lists of resources
Catalogs - individual items within collections of information resources
Indexes

yeji Kim

2024/09/19 3:49 PM

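Cognitive load design

Information visualization and knowledge visualization: information visualization - aids understanding; knowledge visualization - conveys insight and creates new units of knowledge.
Hick's law: minimize the number of choices; break complex tasks into steps; highlight recommended options; use progressive onboarding; do not simplify to the point of abstraction. Offer the right choice at the right time. https://tammist.tistory.com/46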

yeji Kim

2024/09/19 11:50 AM

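Model size - divide the parameter count by 1 billion, then multiply by the number of bytes per data type. E.g. a 7B model at 16 bits (2 bytes) → 7*2 = 14 GB.
from llama_index.core import Document, VectorStoreIndex
Implementing reciprocal rank fusion: BM25 and semantic vector search.
Vector databases: vector libraries - Faiss, Annoy, NMSLIB, ScaNN; dedicated vector databases - Pinecone, Weaviate, Milvus, Chroma, Qdrant, Vespa; databases with added vector features - Elasticsearch, PostgreSQL, MongoDB, Neo4j.
MLOps: data preparation → model training → (model registry ↔ model evaluation) → model deployment → monitoring → retraining ...
Multi-agent - AutoGen, MetaGPT, CrewAI. User-tailored information.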
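Reciprocal rank fusion, mentioned above as the way to combine BM25 and vector search results, can be sketched as follows. The function name, the document IDs, and the two input rankings are made up for the example, and the constant k = 60 is just the commonly used default, not something from these notes.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists: each document scores sum(1 / (k + rank))."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword (BM25) search and a vector search.
bm25_ranking = ["d1", "d2", "d3"]
vector_ranking = ["d3", "d1", "d4"]

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
print(fused)  # ['d1', 'd3', 'd2', 'd4']
```

Because only ranks are used, the two retrievers' raw scores never need to be put on the same scale.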

yeji Kim

2024/09/19 9:57 AM

Designing the DB tables

To efficiently implement the features you described, the database structure needs to account for several types of entities: lecture notes, textbooks, past exam questions, professors, and study tips. Additionally, the system should allow cross-referencing between these entities, version control for lecture notes, and the ability to filter data by professor. Here’s how you can structure the database tables: Key Tables for the Database Structure Subjects (subjects): Stores information about each subject. Professors (professors): Stores information about the professors. Lecture Notes (lecture_notes): Stores metadata for each lecture note document. Lecture Slides (lecture_slides): Stores content for individual slides within a lecture note. Textbooks (textbooks): Stores metadata about textbooks for each subject. Textbook Pages (textbook_pages): Stores the text content of each page in a textbook. Test Questions (test_questions): Stores individual test questions. Slide to Textbook Mapping (slide_textbook_mapping): Relates specific slides to textbook pages or paragraphs. Slide to Test Question Mapping (slide_question_mapping): Relates specific slides to test questions. Test Question to Textbook Mapping (question_textbook_mapping): Relates specific test questions to textbook pages or paragraphs. Study Tips (study_tips): Stores study tips from professors related to each subject. Versions (lecture_note_versions): Manages versioning for the lecture notes. Table Structures and Relationships Subjects Table (subjects) subject_id: INT (Primary Key) subject_name: VARCHAR(255) (e.g., "Biochemistry") description: TEXT (optional description) Professors Table (professors)
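As a rough sketch, the one table whose columns are fully listed above (subjects) could be created like this. SQLite via Python's built-in `sqlite3` stands in for whatever database is actually used, and the `NOT NULL` constraint is an assumption. The other tables are left out because their columns are not given here.

```python
import sqlite3

# In-memory SQLite database, used only to try the schema out.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE subjects (
        subject_id   INTEGER PRIMARY KEY,      -- INT (Primary Key)
        subject_name VARCHAR(255) NOT NULL,    -- e.g. "Biochemistry"
        description  TEXT                      -- optional description
    );
    """
)

conn.execute(
    "INSERT INTO subjects (subject_name, description) VALUES (?, ?)",
    ("Biochemistry", None),
)
row = conn.execute("SELECT subject_id, subject_name FROM subjects").fetchone()
print(row)  # (1, 'Biochemistry')
```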

yeji Kim

2024/09/05 3:06 PM

OpenAI Embedding & Semantic search

https://platform.openai.com/docs/api-reference/embeddings

yeji Kim

2024/09/05 3:06 PM

OpenAI API

https://platform.openai.com/docs/guides/batch/getting-started
Using the Batch API: upload the batch file; create the batch; check the status of the batch; retrieve the results.
Look into other LLMs. I should do the basic processing with this!!! (wow)
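Before the upload step, the batch input file has to exist. Below is a minimal sketch of building it: one JSON request per line, each with `custom_id`, `method`, `url`, and `body`. The helper name, the model name, and the prompts are assumptions for the example, and the upload and batch creation calls are left out.

```python
import json


def build_batch_lines(prompts: list[str], model: str = "gpt-4o-mini") -> list[str]:
    """Return one JSON string per request, ready to be written as a .jsonl file."""
    lines = []
    for i, prompt in enumerate(prompts):
        request = {
            "custom_id": f"request-{i}",      # used to match results to requests
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": model,               # assumed model name
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        lines.append(json.dumps(request, ensure_ascii=False))
    return lines


# Hypothetical prompts, e.g. grammar fixes on OCR output.
lines = build_batch_lines(
    ["Fix the grammar: 'He go to school.'", "Fix the grammar: 'She have a pen.'"]
)
batch_jsonl = "\n".join(lines) + "\n"
print(batch_jsonl)
```

The resulting text would be saved as a `.jsonl` file and uploaded in the first step of the flow listed above.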

yeji Kim

2024/09/04 7:32 PM

GPT fine tuning

https://platform.openai.com/docs/guides/fine-tuning/preparing-your-dataset https://www.datacamp.com/tutorial/fine-tuning-openais-gpt-4-step-by-step-guide
Fine-tuning GPT-4o: the data has to be prepared in JSONL format.
Prompt-completion format (legacy base models):
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
Chat format (for chat models such as GPT-4o):
{"messages": [{"role": "system", "content": "You are teaching assistant for Machine Learning. You should help to user to answer on his question."}, {"role": "user", "content": "What is machine learning?"}, {"role": "assistant", "content": "'Tis but the art of teaching machines to think, to learn from data most fine, and decisions to link."}]}
{"messages": [{"role": "system", "content": "You are teaching assistant for Machine Learning. You should help to user to answer on his question."}, {"role": "user", "content": "How doth neural networks work?"}, {"role": "assistant", "content": "They're like the brains of yon digital beast, with layers many, and nodes that cease."}]}
Steps: upload the training file; create a fine-tuned model; analyze the fine-tuned model. The same information is also visible in the UI.
When the fine-tuning job finishes, the model name appears in the 'fine_tuned_model' field of the job details. → Use the fine-tuned model by specifying that model name.
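Here is a minimal sketch of producing and sanity-checking a chat-format JSONL training file before uploading it. The helper names and the sample texts are made up, and the checks are deliberately simplified (only system/user/assistant roles, last message from the assistant). They are not the API's full validation rules.

```python
import json

ALLOWED_ROLES = {"system", "user", "assistant"}  # simplified role set (assumption)


def make_example(system: str, user: str, assistant: str) -> dict:
    """Build one chat-format training example."""
    return {
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
            {"role": "assistant", "content": assistant},
        ]
    }


def to_jsonl(examples: list[dict]) -> str:
    """Serialize examples as JSONL: one JSON object per line."""
    return "\n".join(json.dumps(e, ensure_ascii=False) for e in examples) + "\n"


def validate_jsonl(text: str) -> int:
    """Return the number of valid examples; raise ValueError on a bad line."""
    count = 0
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        record = json.loads(line)  # JSONDecodeError is a subclass of ValueError
        messages = record.get("messages") if isinstance(record, dict) else None
        if not isinstance(messages, list) or not messages:
            raise ValueError(f"line {line_no}: missing 'messages' list")
        for message in messages:
            if message.get("role") not in ALLOWED_ROLES:
                raise ValueError(f"line {line_no}: unexpected role")
            if not isinstance(message.get("content"), str):
                raise ValueError(f"line {line_no}: content must be a string")
        if messages[-1]["role"] != "assistant":
            raise ValueError(f"line {line_no}: last message must be from assistant")
        count += 1
    return count


system_prompt = "You are a teaching assistant for Machine Learning."
examples = [
    make_example(system_prompt, "What is machine learning?", "Teaching machines to learn from data."),
    make_example(system_prompt, "What is overfitting?", "Memorizing the training data instead of generalizing."),
]
training_jsonl = to_jsonl(examples)
print(validate_jsonl(training_jsonl))  # 2
```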

yeji Kim

2024/09/04 1:31 PM

Semantic search

ENHANCING KNOWLEDGE RETRIEVAL WITH IN-CONTEXT LEARNING AND SEMANTIC SEARCH THROUGH GENERATIVE AI
Method 1: Generative text retrieval (GTR). Embed each chunk with word2vec or similar → build a vector database. Compute similarity against the query embedding → take the closest chunk.
Generative tabular text retrieval (GTR-T). First fetch the database tables and metadata and save them as .csv. Embed the query to find the relevant table, then give that table to the LLM to generate a suitable SQL query.
Olio: A Semantic Search Interface for Data Repositories
Intro: Q&A, exploratory search, design search. Visualization with Tableau → provide thumbnails?
Related works: semantic web search systems are keyword (structured query language) based or NL based. Keyword based: QUERIX - Stanford CoreNLP parser + WordNet.
Olio classifies intent into trends, location, groupings, aggregations, filters, etc. It seems to sort intents into specific groups. I'm not sure it fits what I'm trying to do, so I stopped reading for now.
Know where to go: make llm a relevant, responsible, and trustworthy searcher.
Intro
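The retrieval step described above for GTR (embed chunks, embed the query, pick the closest) can be sketched with plain cosine similarity. The three-dimensional vectors below are hand-written stand-ins for real embeddings, and the chunk IDs and the `retrieve` helper are hypothetical. In practice the vectors would come from an embedding model and sit in a vector database.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec: list[float], store: dict[str, list[float]], top_k: int = 1) -> list[str]:
    """Return the IDs of the top_k chunks most similar to the query vector."""
    ranked = sorted(
        store.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [chunk_id for chunk_id, _ in ranked[:top_k]]


# Toy "vector database": chunk ID -> embedding (made-up numbers).
store = {
    "chunk-a": [0.9, 0.1, 0.0],
    "chunk-b": [0.1, 0.8, 0.1],
    "chunk-c": [0.0, 0.2, 0.9],
}
query_vec = [0.2, 0.9, 0.0]

print(retrieve(query_vec, store))           # ['chunk-b']
print(retrieve(query_vec, store, top_k=2))  # ['chunk-b', 'chunk-a']
```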

yeji Kim

2024/09/04 8:08 AM

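Hyperledger Fabric - chaincode: plays the role of smart contracts. Various pluggable feature options. Ledger data can be stored in various formats. So can consensus - Kafka, Raft. MSP.
Goals: targets permissioned participants; based on a modular architecture.
Features: permissioned; non-deterministic; swappable modular architecture; available consensus algorithms - Solo, Kafka, practical Byzantine fault tolerance; multi-blockchain - one blockchain network can be logically split into several independent blockchains.
Why is ordering necessary? Consensus algorithms? Multi-blockchain support?
Consortium blockchain - permitted institutions only; participants belonging to the consortium are the managing body. R3CEV, Casper.
Private - permitted institutions only; a central institution holds all authority. HL Fabric, EEA.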

yeji Kim

2024/08/23 9:43 AM

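Hyperledger Fabric development - dApp execution flow (bringing up the network)

https://www.youtube.com/watch?v=VAjOIZB4PVI
Form the company consortium - companies a, b, c. Decide how many peers each company runs: a - 1 peer, b - 2 peers ... Each company has its own CA (ca_a, ca_b, ca_c) → root_ca. cdb (CouchDB). Decide the orderer organization (solo, kafka - one or more, etcdraft - asynchronous?). A channel is a subgroup - channels handle transactions.
With HL Fabric you can pull these elements together and set up a network easily. MongoDB -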

yeji Kim

2024/08/21 5:59 PM

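Questions
Endorsement policy: the organizations in the blockchain network that must sign a transaction generated by a given smart contract in order to declare that transaction valid → how is validity actually declared?
Every transaction is recorded on the blockchain regardless of validity, but only valid transactions contribute to the world state.
Example: car transfer transaction t3, a transaction for transferring a car between ORG1 and ORG2. Input: {CAR1, ORG1, ORG2}. Output: {CAR1.owner=ORG1, CAR1.owner=ORG2} => a way of showing that the owner changed from ORG1 to ORG2. The input is signed by the application's organization, ORG1. The output is signed by the two organizations identified by the endorsement policy, ORG1 and ORG2. Signatures are generated with private keys. Every node in the network agrees on the transaction, and anyone in the network can verify it.
Transaction: two validation steps. First, check that enough organizations signed according to the endorsement policy. Second, check that the current world state value matches the transaction's read set as it was when the endorsing peer nodes signed, i.e. that no update happened in between. A transaction that passes both tests is marked valid.
Organizations are independent enough to help separate their work traffic with different counterparties, yet integrated enough to coordinate independent activities when needed.
In most scenarios, the recommendation is to include initialization logic in the chaincode rather than use the chaincode lifecycle mechanism described above.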
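The two validation steps above can be sketched as a toy model. It is not Fabric code. The endorsement policy is reduced to "all listed organizations must have signed", signatures are reduced to a set of organization names, and versions are plain integers. Every class and function name is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    tx_id: str
    read_set: dict[str, int]    # key -> version seen when the transaction was endorsed
    write_set: dict[str, str]   # key -> new value
    endorsers: set[str]         # organizations whose peers signed the result


def is_valid(tx: Transaction, policy_orgs: set[str], world_state: dict) -> bool:
    # Step 1: enough organizations signed, according to the endorsement policy.
    if not policy_orgs.issubset(tx.endorsers):
        return False
    # Step 2: the world state has not changed since endorsement (read set matches).
    for key, version_read in tx.read_set.items():
        _, current_version = world_state.get(key, (None, 0))
        if current_version != version_read:
            return False
    return True


def commit(tx: Transaction, policy_orgs: set[str], world_state: dict, ledger: list) -> bool:
    valid = is_valid(tx, policy_orgs, world_state)
    ledger.append((tx.tx_id, valid))  # every transaction is recorded, valid or not
    if valid:                         # only valid ones update the world state
        for key, value in tx.write_set.items():
            _, version = world_state.get(key, (None, 0))
            world_state[key] = (value, version + 1)
    return valid


world_state = {"CAR1.owner": ("ORG1", 1)}  # key -> (value, version)
ledger: list = []
policy = {"ORG1", "ORG2"}

t3 = Transaction("t3", {"CAR1.owner": 1}, {"CAR1.owner": "ORG2"}, {"ORG1", "ORG2"})
stale = Transaction("t4", {"CAR1.owner": 1}, {"CAR1.owner": "ORG3"}, {"ORG1", "ORG2"})

print(commit(t3, policy, world_state, ledger))     # True
print(commit(stale, policy, world_state, ledger))  # False: the read set is out of date
print(world_state)  # {'CAR1.owner': ('ORG2', 2)}
print(ledger)       # [('t3', True), ('t4', False)]
```

The second transaction stays in the ledger but leaves the world state untouched, which matches the note that invalid transactions are recorded without contributing to the world state.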

yeji Kim

2024/08/21 1:12 PM

Notifying on every DB change - MySQL trigger, Flutter cron

GPT To receive notifications when a change occurs in the user's schedule value in the database, you can implement several approaches depending on your specific needs and the technologies you're comfortable with. Here’s a high-level overview of the options: 1. Polling (Active Checking) How it works: You can create a script that periodically checks the database for changes. If a change is detected, the script sends a notification. Steps: Set up a CRON job (or Windows Task Scheduler if on Windows) that runs a script every X minutes. The script queries the database for changes in the user's schedule value. If a change is detected, send a notification (e.g., via email, SMS, or push notification). Pros: Easy to implement. Cons: Not real-time, may introduce unnecessary load on the server if the polling frequency is too high. 2. Database Triggers + Notification System How it works: Use a database trigger to detect changes and invoke a notification system. Steps: Create a Trigger: Set up a trigger in your MySQL database (if you're using MySQL with XAMPP) that activates when a change occurs in the user's schedule table. Trigger Action: The trigger could insert a record into a separate "notifications" table or call an external script to send a notification. Notification System: Use a script or service (such as an email or SMS API) to send notifications when new records are added to the "notifications" table. Pros: Real-time notifications, less overhead than polling. Cons: More complex setup, requires database-level modifications. Example of a MySQL trigger: 3. WebSocket or Server-Sent Events (SSE)
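The trigger-plus-notifications-table idea from option 2 above can be sketched as follows. To keep it runnable without a server, it uses SQLite through Python's `sqlite3`, not MySQL. MySQL trigger syntax differs: it has no `WHEN` clause and would use an `IF` inside the trigger body. The table names, column names, and the `pending_notifications` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE schedules (
        user_id  INTEGER PRIMARY KEY,
        schedule TEXT NOT NULL
    );

    CREATE TABLE notifications (
        id           INTEGER PRIMARY KEY AUTOINCREMENT,
        user_id      INTEGER NOT NULL,
        old_schedule TEXT,
        new_schedule TEXT,
        sent         INTEGER NOT NULL DEFAULT 0
    );

    -- Fires only when the schedule value actually changes.
    CREATE TRIGGER schedule_changed
    AFTER UPDATE OF schedule ON schedules
    FOR EACH ROW
    WHEN OLD.schedule <> NEW.schedule
    BEGIN
        INSERT INTO notifications (user_id, old_schedule, new_schedule)
        VALUES (NEW.user_id, OLD.schedule, NEW.schedule);
    END;
    """
)


def pending_notifications(conn: sqlite3.Connection) -> list[tuple]:
    """Fetch unsent notifications and mark them as sent (the 'notification system' step)."""
    rows = conn.execute(
        "SELECT id, user_id, old_schedule, new_schedule "
        "FROM notifications WHERE sent = 0 ORDER BY id"
    ).fetchall()
    conn.execute("UPDATE notifications SET sent = 1 WHERE sent = 0")
    return rows


conn.execute("INSERT INTO schedules VALUES (1, 'Mon 09:00')")
conn.execute("UPDATE schedules SET schedule = 'Tue 10:00' WHERE user_id = 1")

first = pending_notifications(conn)
second = pending_notifications(conn)
print(first)   # [(1, 1, 'Mon 09:00', 'Tue 10:00')]
print(second)  # []
```

A separate script or service would call something like `pending_notifications` and send the email, SMS, or push message for each returned row.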

yeji Kim

2024/08/09 7:26 PM

Figma

Frame resolution - by usage share. Community UI kit - duplicate. Creating buttons 1-3, prototyping, text styling, auto layout https://www.youtube.com/watch?v=E4NfxpV9hpE Midjourney

yeji Kim

2024/08/09 12:32 PM

fabric application

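About asset transfer. Components: sample app, smart contract.
Preparing the sample app: npm install → installs dependencies and builds the app.
1. Set up the gRPC connection to the gateway.
2. Create the gateway connection. Requirements: a gRPC connection to the Fabric gateway; the client identity used when transacting with the network; a digital signature.
3. Access the contract to invoke: gateway.getNetwork, network.getContract.
4. Populate the ledger with sample assets. submitTransaction does the following through the Fabric gateway: endorses the transaction proposal; submits the endorsed transaction to the ordering service; waits until the transaction is committed and the ledger state is updated. The sample app calls initLedger: contract.submitTransaction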

yeji Kim

2024/08/05 11:15 AM